I'm using Nomad on GCE and I cannot pull Docker images from the public registry.
I can pull from the command line with docker pull gerlacdt/helloapp:v0.1.0
But when I try to run a Nomad job with a public-registry image, I get this error:
Failed to find docker auth for repo "gerlacdt/helloapp": docker-credential-gcr
Relevant files:
The /root/.docker/config.json file:
{
"auths": {
"https://index.docker.io/v1/": {}
},
"credHelpers": {
"asia.gcr.io": "gcr",
"eu.gcr.io": "gcr",
"gcr.io": "gcr",
"staging-k8s.gcr.io": "gcr",
"us.gcr.io": "gcr"
}
}
The nomad client config:
datacenter = "europe-west1-c"
name = "consul-clients-092s"
region = "europe-west1"
bind_addr = "0.0.0.0"
advertise {
http = "172.27.3.132"
rpc = "172.27.3.132"
serf = "172.27.3.132"
}
client {
enabled = true
options = {
"docker.auth.config" = "/root/.docker/config.json"
"docker.auth.helper" = "gcr"
}
}
consul {
address = "127.0.0.1:8500"
}
The job file:
job "helloapp" {
datacenters = ["europe-west1-b", "europe-west1-c", "europe-west1-d"]
constraint {
attribute = "${attr.kernel.name}"
value = "linux"
}
# Configure the job to do rolling updates
update {
stagger = "10s"
max_parallel = 1
}
group "hello" {
count = 1
restart {
attempts = 2
interval = "1m"
delay = "10s"
mode = "fail"
}
# Define a task to run
task "hello" {
driver = "docker"
config {
image = "gerlacdt/helloapp:v0.1.0"
port_map {
http = 8080
}
}
service {
name = "${TASKGROUP}-service"
tags = [
# "traefik.tags=public",
"traefik.frontend.rule=Host:bla.zapto.org",
"traefik.frontend.entryPoints=http",
"traefik.tags=exposed"
]
port = "http"
check {
name = "alive"
type = "http"
interval = "10s"
timeout = "3s"
path = "/health"
}
}
resources {
cpu = 500 # 500 MHz
memory = 128 # 128MB
network {
mbits = 1
port "http" {
}
}
}
logs {
max_files = 10
max_file_size = 15
}
kill_timeout = "10s"
}
}
}
The complete error message from the Nomad client logs:
failed to initialize task "hello" for alloc "c845bdb9-500a-dc40-0f17-2b79fe4866f1": Failed to find docker auth for repo "gerlacdt/helloapp": docker-credential-gcr with input "gerlacdt/helloapp" failed with stderr:
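For reference, a minimal sketch (not from the original post) of the client options with the helper line removed, under the assumption that "docker.auth.helper" makes Nomad invoke docker-credential-gcr for every repository, including Docker Hub ones, while "docker.auth.config" on its own lets the per-registry credHelpers in config.json scope the gcr helper to the *.gcr.io registries:
client {
  enabled = true
  options = {
    # Nomad reads /root/.docker/config.json; its credHelpers section applies
    # docker-credential-gcr only to the *.gcr.io registries, so public
    # Docker Hub images are pulled anonymously.
    "docker.auth.config" = "/root/.docker/config.json"
    # "docker.auth.helper" = "gcr"  # would apply the helper to every registry
  }
}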
Related
I can't get TLS to work. The CertificateRequest gets created, and so do the Order and the Challenge. However, the Challenge is stuck in pending.
Name: test-tls-secret-8qshd-3608253913-1269058669
Namespace: test
Labels: <none>
Annotations: <none>
API Version: acme.cert-manager.io/v1
Kind: Challenge
Metadata:
Creation Timestamp: 2022-07-19T08:17:04Z
Finalizers:
finalizer.acme.cert-manager.io
Generation: 1
Managed Fields:
API Version: acme.cert-manager.io/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:finalizers:
.:
v:"finalizer.acme.cert-manager.io":
Manager: cert-manager-challenges
Operation: Update
Time: 2022-07-19T08:17:04Z
API Version: acme.cert-manager.io/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:ownerReferences:
.:
k:{"uid":"06029d3f-d1ce-45db-a267-796ff9b82a67"}:
f:spec:
.:
f:authorizationURL:
f:dnsName:
f:issuerRef:
.:
f:group:
f:kind:
f:name:
f:key:
f:solver:
.:
f:dns01:
.:
f:azureDNS:
.:
f:environment:
f:hostedZoneName:
f:resourceGroupName:
f:subscriptionID:
f:token:
f:type:
f:url:
f:wildcard:
Manager: cert-manager-orders
Operation: Update
Time: 2022-07-19T08:17:04Z
API Version: acme.cert-manager.io/v1
Fields Type: FieldsV1
fieldsV1:
f:status:
.:
f:presented:
f:processing:
f:reason:
f:state:
Manager: cert-manager-challenges
Operation: Update
Subresource: status
Time: 2022-07-19T08:25:38Z
Owner References:
API Version: acme.cert-manager.io/v1
Block Owner Deletion: true
Controller: true
Kind: Order
Name: test-tls-secret-8qshd-3608253913
UID: 06029d3f-d1ce-45db-a267-796ff9b82a67
Resource Version: 4528159
UID: 9594ed48-72c6-4403-8356-4991950fe9bb
Spec:
Authorization URL: https://acme-v02.api.letsencrypt.org/acme/authz-v3/131873811576
Dns Name: test.internal.<company_id>.com
Issuer Ref:
Group: cert-manager.io
Kind: ClusterIssuer
Name: letsencrypt
Key: xrnhZETWbkGTE7CA0A3CQd6a48d4JG4HKDiCXPpxTWM
Solver:
dns01:
Azure DNS:
Environment: AzurePublicCloud
Hosted Zone Name: internal.<company_id>.com
Resource Group Name: tool-cluster-rg
Subscription ID: <subscription_id>
Token: jXCR2UorNanlHqZd8T7Ifjbx6PuGfLBwnzWzBnDvCyc
Type: DNS-01
URL: https://acme-v02.api.letsencrypt.org/acme/chall-v3/131873811576/vCGdog
Wildcard: false
Status:
Presented: false
Processing: true
Reason: azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/<subscription_id>/resourceGroups/tool-cluster-rg/providers/Microsoft.Network/dnsZones/internal.<company_id>.com/TXT/_acme-challenge.test?api-version=2017-10-01: StatusCode=404 -- Original Error: adal: Refresh request failed. Status Code = '404'. Response body: getting assigned identities for pod cert-manager/cert-manager-5bb7949947-qlg5j in CREATED state failed after 16 attempts, retry duration [5]s, error: <nil>. Check MIC pod logs for identity assignment errors
Endpoint http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.core.windows.net%2F
State: pending
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Started 59m cert-manager-challenges Challenge scheduled for processing
Warning PresentError 11s (x7 over 51m) cert-manager-challenges Error presenting challenge: azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/<subscription_id>/resourceGroups/tool-cluster-rg/providers/Microsoft.Network/dnsZones/internal.<company_id>.com/TXT/_acme-challenge.test?api-version=2017-10-01: StatusCode=404 -- Original Error: adal: Refresh request failed. Status Code = '404'. Response body: getting assigned identities for pod cert-manager/cert-manager-5bb7949947-qlg5j in CREATED state failed after 16 attempts, retry duration [5]s, error: <nil>. Check MIC pod logs for identity assignment errors
Endpoint http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.core.windows.net%2F
It says to check the MIC pod logs; however, there are no errors logged:
I0719 08:16:52.271516 1 mic.go:587] pod test/test-deployment-b5dcc75f4-5gdtj has no assigned node yet. it will be ignored
I0719 08:16:52.284362 1 mic.go:608] No AzureIdentityBinding found for pod test/test-deployment-b5dcc75f4-5gdtj that matches selector: certman-label. it will be ignored
I0719 08:16:53.735678 1 mic.go:648] certman-identity identity not found when using test/certman-id-binding binding
I0719 08:16:53.737027 1 mic.go:1040] processing node aks-default-10282586-vmss, add [1], del [0], update [0]
I0719 08:16:53.737061 1 crd.go:514] creating assigned id test/test-deployment-b5dcc75f4-5gdtj-test-certman-identity
I0719 08:16:53.844892 1 cloudprovider.go:210] updating user-assigned identities on aks-default-10282586-vmss, assign [1], unassign [0]
I0719 08:17:04.545556 1 crd.go:777] updating AzureAssignedIdentity test/test-deployment-b5dcc75f4-5gdtj-test-certman-identity status to Assigned
I0719 08:17:04.564464 1 mic.go:525] work done: true. Found 1 pods, 1 ids, 1 bindings
I0719 08:17:04.564477 1 mic.go:526] total work cycles: 392, out of which work was done in: 320
I0719 08:17:04.564492 1 stats.go:183] ** stats collected **
I0719 08:17:04.564497 1 stats.go:162] Pod listing: 20.95µs
I0719 08:17:04.564504 1 stats.go:162] AzureIdentity listing: 2.357µs
I0719 08:17:04.564508 1 stats.go:162] AzureIdentityBinding listing: 3.211µs
I0719 08:17:04.564512 1 stats.go:162] AzureAssignedIdentity listing: 431ns
I0719 08:17:04.564516 1 stats.go:162] System: 71.101µs
I0719 08:17:04.564520 1 stats.go:162] CacheSync: 4.482µs
I0719 08:17:04.564523 1 stats.go:162] Cloud provider GET: 83.123547ms
I0719 08:17:04.564527 1 stats.go:162] Cloud provider PATCH: 10.700611864s
I0719 08:17:04.564531 1 stats.go:162] AzureAssignedIdentity creation: 24.654916ms
I0719 08:17:04.564535 1 stats.go:162] AzureAssignedIdentity update: 0s
I0719 08:17:04.564538 1 stats.go:162] AzureAssignedIdentity deletion: 0s
I0719 08:17:04.564542 1 stats.go:170] Number of cloud provider PATCH: 1
I0719 08:17:04.564546 1 stats.go:170] Number of cloud provider GET: 1
I0719 08:17:04.564549 1 stats.go:170] Number of AzureAssignedIdentities created in this sync cycle: 1
I0719 08:17:04.564554 1 stats.go:170] Number of AzureAssignedIdentities updated in this sync cycle: 0
I0719 08:17:04.564557 1 stats.go:170] Number of AzureAssignedIdentities deleted in this sync cycle: 0
I0719 08:17:04.564561 1 stats.go:162] Find AzureAssignedIdentities to create: 0s
I0719 08:17:04.564564 1 stats.go:162] Find AzureAssignedIdentities to delete: 0s
I0719 08:17:04.564568 1 stats.go:162] Total time to assign or update AzureAssignedIdentities: 10.827425179s
I0719 08:17:04.564573 1 stats.go:162] Total: 10.82763016s
I0719 08:17:04.564577 1 stats.go:212] *********************
I0719 08:19:34.077484 1 mic.go:1466] reconciling identity assignment for [/subscriptions/<subscription_id>/resourceGroups/tool-cluster-rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/cert-manager-dns01] on node aks-default-10282586-vmss
I0719 08:22:34.161195 1 mic.go:1466] reconciling identity assignment for [/subscriptions/<subscription_id>/resourceGroups/tool-cluster-rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/cert-manager-dns01] on node aks-default-10282586-vmss
The "reconciling identity" output gets repeated afterwards. Up to this point, I was able to handle my way through error messages, but now I have no idea how to proceed. Anyone got any lead what I'm missing?
Following my terraform code for the infrastructure.
terraform {
cloud {
organization = "<company_id>"
workspaces {
name = "tool-cluster"
}
}
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = ">= 3.6.0, < 4.0.0"
}
}
}
provider "azurerm" {
features {}
}
data "azurerm_client_config" "default" {}
variable "id" {
type = string
description = "Company wide unique terraform identifier"
default = "tool-cluster"
}
resource "azurerm_resource_group" "default" {
name = "${var.id}-rg"
location = "westeurope"
}
resource "azurerm_kubernetes_cluster" "default" {
name = "${var.id}-aks"
location = azurerm_resource_group.default.location
resource_group_name = azurerm_resource_group.default.name
dns_prefix = var.id
default_node_pool {
name = "default"
node_count = 1
vm_size = "Standard_D4_v5"
}
identity {
type = "SystemAssigned"
}
role_based_access_control_enabled = true
http_application_routing_enabled = true
}
resource "azurerm_dns_zone" "internal" {
name = "internal.<company_id>.com"
resource_group_name = azurerm_resource_group.default.name
}
resource "azurerm_user_assigned_identity" "dns_identity" {
name = "cert-manager-dns01"
resource_group_name = azurerm_resource_group.default.name
location = azurerm_resource_group.default.location
}
resource "azurerm_role_assignment" "dns_contributor" {
scope = azurerm_dns_zone.internal.id
role_definition_name = "DNS Zone Contributor"
principal_id = azurerm_user_assigned_identity.dns_identity.principal_id
}
I've granted the kubelet_identity the roles "Managed Identity Operator" and "Virtual Machine Contributor" on the cluster's generated resource group (MC_tool-cluster-rg_tool-cluster-aks_westeurope), and "Managed Identity Operator" on the resource group of the cluster itself (tool-cluster-rg).
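A sketch of how those role assignments could be expressed in Terraform instead of by hand (assumptions: the kubelet identity and the generated node resource group are read from azurerm_kubernetes_cluster.default, and the resource names here are made up for illustration):
data "azurerm_resource_group" "aks_nodes" {
  # The auto-generated node resource group, e.g. MC_tool-cluster-rg_tool-cluster-aks_westeurope
  name = azurerm_kubernetes_cluster.default.node_resource_group
}

resource "azurerm_role_assignment" "kubelet_mi_operator_nodes" {
  scope                = data.azurerm_resource_group.aks_nodes.id
  role_definition_name = "Managed Identity Operator"
  principal_id         = azurerm_kubernetes_cluster.default.kubelet_identity[0].object_id
}

resource "azurerm_role_assignment" "kubelet_vm_contributor_nodes" {
  scope                = data.azurerm_resource_group.aks_nodes.id
  role_definition_name = "Virtual Machine Contributor"
  principal_id         = azurerm_kubernetes_cluster.default.kubelet_identity[0].object_id
}

resource "azurerm_role_assignment" "kubelet_mi_operator_cluster_rg" {
  scope                = azurerm_resource_group.default.id
  role_definition_name = "Managed Identity Operator"
  principal_id         = azurerm_kubernetes_cluster.default.kubelet_identity[0].object_id
}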
Code for the cert-manager:
terraform {
cloud {
organization = "<company_id>"
workspaces {
name = "cert-manager"
}
}
required_providers {
kubernetes = {
source = "hashicorp/kubernetes"
version = ">= 2.12.0, < 3.0.0"
}
helm = {
source = "hashicorp/helm"
version = ">= 2.6.0, < 3.0.0"
}
azurerm = {
source = "hashicorp/azurerm"
version = ">= 3.6.0, < 4.0.0"
}
}
}
data "terraform_remote_state" "tool-cluster" {
backend = "remote"
config = {
organization = "<company_id>"
workspaces = {
name = "tool-cluster"
}
}
}
provider "azurerm" {
features {}
}
provider "kubernetes" {
host = data.terraform_remote_state.tool-cluster.outputs.host
client_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_certificate)
client_key = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_key)
cluster_ca_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.cluster_ca_certificate)
}
provider "helm" {
kubernetes {
host = data.terraform_remote_state.tool-cluster.outputs.host
client_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_certificate)
client_key = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_key)
cluster_ca_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.cluster_ca_certificate)
}
}
locals {
app-name = "cert-manager"
}
resource "kubernetes_namespace" "cert_manager" {
metadata {
name = local.app-name
}
}
resource "helm_release" "cert_manager" {
name = local.app-name
repository = "https://charts.jetstack.io"
chart = "cert-manager"
version = "v1.8.2"
namespace = kubernetes_namespace.cert_manager.metadata.0.name
set {
name = "installCRDs"
value = "true"
}
}
resource "helm_release" "aad_pod_identity" {
name = "aad-pod-identity"
repository = "https://raw.githubusercontent.com/Azure/aad-pod-identity/master/charts"
chart = "aad-pod-identity"
version = "v4.1.10"
namespace = kubernetes_namespace.cert_manager.metadata.0.name
}
resource "azurerm_user_assigned_identity" "default" {
name = local.app-name
resource_group_name = data.terraform_remote_state.tool-cluster.outputs.resource_name
location = data.terraform_remote_state.tool-cluster.outputs.resource_location
}
resource "azurerm_role_assignment" "default" {
scope = data.terraform_remote_state.tool-cluster.outputs.dns_zone_id
role_definition_name = "DNS Zone Contributor"
principal_id = azurerm_user_assigned_identity.default.principal_id
}
output "namespace" {
value = kubernetes_namespace.cert_manager.metadata.0.name
sensitive = false
}
and the code for my issuer:
terraform {
cloud {
organization = "<company_id>"
workspaces {
name = "cert-issuer"
}
}
required_providers {
kubernetes = {
source = "hashicorp/kubernetes"
version = ">= 2.12.0, < 3.0.0"
}
helm = {
source = "hashicorp/helm"
version = ">= 2.6.0, < 3.0.0"
}
azurerm = {
source = "hashicorp/azurerm"
version = ">= 3.6.0, < 4.0.0"
}
}
}
data "terraform_remote_state" "tool-cluster" {
backend = "remote"
config = {
organization = "<company_id>"
workspaces = {
name = "tool-cluster"
}
}
}
data "terraform_remote_state" "cert-manager" {
backend = "remote"
config = {
organization = "<company_id>"
workspaces = {
name = "cert-manager"
}
}
}
provider "azurerm" {
features {}
}
provider "kubernetes" {
host = data.terraform_remote_state.tool-cluster.outputs.host
client_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_certificate)
client_key = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_key)
cluster_ca_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.cluster_ca_certificate)
}
provider "helm" {
kubernetes {
host = data.terraform_remote_state.tool-cluster.outputs.host
client_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_certificate)
client_key = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_key)
cluster_ca_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.cluster_ca_certificate)
}
}
locals {
app-name = "cert-manager"
}
data "azurerm_subscription" "current" {}
resource "kubernetes_manifest" "cluster_issuer" {
manifest = yamldecode(templatefile(
"${path.module}/cluster-issuer.tpl.yaml",
{
"name" = "letsencrypt"
"subscription_id" = data.azurerm_subscription.current.subscription_id
"resource_group_name" = data.terraform_remote_state.tool-cluster.outputs.resource_name
"dns_zone_name" = data.terraform_remote_state.tool-cluster.outputs.dns_zone_name
}
))
}
Also, the yaml:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: ${name}
spec:
acme:
email: support@<company_id>.com
server: https://acme-v02.api.letsencrypt.org/directory
privateKeySecretRef:
name: ${name}
solvers:
- dns01:
azureDNS:
resourceGroupName: ${resource_group_name}
subscriptionID: ${subscription_id}
hostedZoneName: ${dns_zone_name}
environment: AzurePublicCloud
Finally, my sample app:
terraform {
cloud {
organization = "<company_id>"
workspaces {
name = "test-web-app"
}
}
required_providers {
kubernetes = {
source = "hashicorp/kubernetes"
version = ">= 2.12.0, < 3.0.0"
}
azurerm = {
source = "hashicorp/azurerm"
version = ">= 3.6.0, < 4.0.0"
}
azuread = {
source = "hashicorp/azuread"
version = ">= 2.26.0, < 3.0.0"
}
}
}
data "terraform_remote_state" "tool-cluster" {
backend = "remote"
config = {
organization = "<company_id>"
workspaces = {
name = "tool-cluster"
}
}
}
provider "azuread" {}
provider "azurerm" {
features {}
}
provider "kubernetes" {
host = data.terraform_remote_state.tool-cluster.outputs.host
client_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_certificate)
client_key = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_key)
cluster_ca_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.cluster_ca_certificate)
}
locals {
app-name = "test"
host = "test.${data.terraform_remote_state.tool-cluster.outputs.cluster_domain_name}"
}
resource "azurerm_dns_cname_record" "default" {
name = local.app-name
zone_name = data.terraform_remote_state.tool-cluster.outputs.dns_zone_name
resource_group_name = data.terraform_remote_state.tool-cluster.outputs.resource_name
ttl = 300
record = local.host
}
resource "azuread_application" "default" {
display_name = local.app-name
}
resource "kubernetes_namespace" "default" {
metadata {
name = local.app-name
}
}
resource "kubernetes_secret" "auth" {
metadata {
name = "basic-auth"
namespace = kubernetes_namespace.default.metadata.0.name
}
data = {
"auth" = file("./auth")
}
}
resource "kubernetes_deployment" "default" {
metadata {
name = "${local.app-name}-deployment"
namespace = kubernetes_namespace.default.metadata.0.name
labels = {
app = local.app-name
}
}
spec {
replicas = 1
selector {
match_labels = {
app = local.app-name
}
}
template {
metadata {
labels = {
app = local.app-name
aadpodidbinding = "certman-label"
}
}
spec {
container {
image = "crccheck/hello-world:latest"
name = local.app-name
port {
container_port = 8000
host_port = 8000
}
}
}
}
}
}
resource "kubernetes_service" "default" {
metadata {
name = "${local.app-name}-svc"
namespace = kubernetes_namespace.default.metadata.0.name
}
spec {
selector = {
app = kubernetes_deployment.default.metadata.0.labels.app
}
port {
port = 8000
target_port = 8000
}
}
}
resource "kubernetes_ingress_v1" "default" {
metadata {
name = "${local.app-name}-ing"
namespace = kubernetes_namespace.default.metadata.0.name
annotations = {
"kubernetes.io/ingress.class" = "addon-http-application-routing"
"cert-manager.io/cluster-issuer" = "letsencrypt"
# basic-auth
"nginx.ingress.kubernetes.io/auth-type" = "basic"
"nginx.ingress.kubernetes.io/auth-secret" = "basic-auth"
"nginx.ingress.kubernetes.io/auth-realm" = "Authentication Required - foo"
}
}
spec {
rule {
host = local.host
http {
path {
path = "/"
backend {
service {
name = kubernetes_service.default.metadata.0.name
port {
number = 8000
}
}
}
}
}
}
rule {
host = trimsuffix(azurerm_dns_cname_record.default.fqdn, ".")
http {
path {
path = "/"
backend {
service {
name = kubernetes_service.default.metadata.0.name
port {
number = 8000
}
}
}
}
}
}
tls {
hosts = [ trimsuffix(azurerm_dns_cname_record.default.fqdn, ".") ]
secret_name = "${local.app-name}-tls-secret"
}
}
}
resource "kubernetes_manifest" "azure_identity" {
manifest = yamldecode(templatefile(
"${path.module}/azure_identity.tpl.yaml",
{
"namespace" = kubernetes_namespace.default.metadata.0.name
"resource_id" = data.terraform_remote_state.tool-cluster.outputs.identity_resource_id
"client_id" = data.terraform_remote_state.tool-cluster.outputs.identity_client_id
}
))
}
resource "kubernetes_manifest" "azure_identity_binding" {
manifest = yamldecode(templatefile(
"${path.module}/azure_identity_binding.tpl.yaml",
{
"namespace" = kubernetes_namespace.default.metadata.0.name
"resource_id" = data.terraform_remote_state.tool-cluster.outputs.identity_resource_id
"client_id" = data.terraform_remote_state.tool-cluster.outputs.identity_client_id
}
))
}
The two identity YAML files:
apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentity
metadata:
annotations:
# recommended to use namespaced identities https://azure.github.io/aad-pod-identity/docs/configure/match_pods_in_namespace/
aadpodidentity.k8s.io/Behavior: namespaced
name: certman-identity
namespace: ${namespace} # change to your preferred namespace
spec:
type: 0 # MSI
resourceID: ${resource_id} # Resource Id From Previous step
clientID: ${client_id} # Client Id from previous step
and
apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentityBinding
metadata:
name: certman-id-binding
namespace: ${namespace} # change to your preferred namespace
spec:
azureIdentity: certman-identity
selector: certman-label # This is the label that needs to be set on cert-manager pods
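For illustration, one way to attach that label to the cert-manager pods is through the chart's podLabels value on the existing helm_release (the podLabels key is an assumption taken from the upstream cert-manager chart, not something in my original code):
resource "helm_release" "cert_manager" {
  name       = local.app-name
  repository = "https://charts.jetstack.io"
  chart      = "cert-manager"
  version    = "v1.8.2"
  namespace  = kubernetes_namespace.cert_manager.metadata.0.name

  set {
    name  = "installCRDs"
    value = "true"
  }

  # Label the cert-manager pods so the AzureIdentityBinding selector
  # "certman-label" matches them rather than only the application pods.
  set {
    name  = "podLabels.aadpodidbinding"
    value = "certman-label"
  }
}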
edit: reformatted
I was not able to solve it with HTTP application routing, so I installed my own ingress, and instead of aad-pod-identity I installed ExternalDNS with a Service Principal. The Terraform code for that:
locals {
app-name = "external-dns"
}
resource "azuread_application" "dns" {
display_name = "dns-service_principal"
}
resource "azuread_application_password" "dns" {
application_object_id = azuread_application.dns.object_id
}
resource "azuread_service_principal" "dns" {
application_id = azuread_application.dns.application_id
description = "Service Principal to write DNS changes for ${data.terraform_remote_state.tool-cluster.outputs.dns_zone_name}"
}
resource "azurerm_role_assignment" "dns_zone_contributor" {
scope = data.terraform_remote_state.tool-cluster.outputs.dns_zone_id
role_definition_name = "DNS Zone Contributor"
principal_id = azuread_service_principal.dns.id
}
resource "azurerm_role_assignment" "rg_reader" {
scope = data.terraform_remote_state.tool-cluster.outputs.dns_zone_id
role_definition_name = "Reader"
principal_id = azuread_service_principal.dns.id
}
resource "kubernetes_secret" "external_dns_secret" {
metadata {
name = "azure-config-file"
}
data = { "azure.json" = jsonencode({
tenantId = data.azurerm_subscription.default.tenant_id
subscriptionId = data.azurerm_subscription.default.subscription_id
resourceGroup = data.terraform_remote_state.tool-cluster.outputs.resource_name
aadClientId = azuread_application.dns.application_id
aadClientSecret = azuread_application_password.dns.value
})
}
}
resource "kubernetes_service_account" "dns" {
metadata {
name = local.app-name
}
}
resource "kubernetes_cluster_role" "dns" {
metadata {
name = local.app-name
}
rule {
api_groups = [ "" ]
resources = [ "services","endpoints","pods", "nodes" ]
verbs = [ "get","watch","list" ]
}
rule {
api_groups = [ "extensions","networking.k8s.io" ]
resources = [ "ingresses" ]
verbs = [ "get","watch","list" ]
}
}
resource "kubernetes_cluster_role_binding" "dns" {
metadata {
name = "${local.app-name}-viewer"
}
role_ref {
api_group = "rbac.authorization.k8s.io"
kind = "ClusterRole"
name = kubernetes_cluster_role.dns.metadata.0.name
}
subject {
kind = "ServiceAccount"
name = kubernetes_service_account.dns.metadata.0.name
}
}
resource "kubernetes_deployment" "dns" {
metadata {
name = local.app-name
}
spec {
strategy {
type = "Recreate"
}
selector {
match_labels = {
"app" = local.app-name
}
}
template {
metadata {
labels = {
"app" = local.app-name
}
}
spec {
service_account_name = kubernetes_service_account.dns.metadata.0.name
container {
name = local.app-name
image = "bitnami/external-dns:0.12.1"
args = [ "--source=service", "--source=ingress", "--provider=azure", "--txt-prefix=externaldns-" ]
volume_mount {
name = kubernetes_secret.external_dns_secret.metadata.0.name
mount_path = "/etc/kubernetes"
read_only = true
}
}
volume {
name = kubernetes_secret.external_dns_secret.metadata.0.name
secret {
secret_name = kubernetes_secret.external_dns_secret.metadata.0.name
}
}
}
}
}
}
With the following job config, curl NOMAD_IP_http:NOMAD_PORT_http cannot access the http-echo service.
There is no listening port on localhost for incoming requests.
Why is that, and how can I access the http-echo service?
job "job" {
datacenters = ["dc1"]
group "group" {
count = 2
network {
port "http" {}
}
service {
name = "http-echo"
port = "http"
tags = [
"http-echo",
]
check {
type = "http"
path = "/health"
interval = "30s"
timeout = "2s"
}
}
task "task" {
driver = "docker"
config {
image = "hashicorp/http-echo:latest"
args = [
"-listen", ":${NOMAD_PORT_http}",
"-text", "Hello and welcome to ${NOMAD_IP_http} running on port ${NOMAD_PORT_http}",
]
}
resources {}
}
}
}
UPDATE
After configuring the driver's network_mode, curl succeeds:
network_mode = "host"
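For placement, a sketch of where that option sits in the task (the "-text" value here is shortened; the ports approach in the answer below is the more idiomatic fix):
task "task" {
  driver = "docker"
  config {
    image        = "hashicorp/http-echo:latest"
    # Host networking binds the container directly on the client's interfaces,
    # so the dynamically allocated NOMAD_PORT_http is reachable without a port mapping.
    network_mode = "host"
    args = [
      "-listen", ":${NOMAD_PORT_http}",
      "-text", "hello",
    ]
  }
}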
You forgot to add ports at job -> group -> task -> config -> ports.
Now it works on the latest Nomad (v1.1.3+).
job "job" {
datacenters = ["dc1"]
group "group" {
count = 2
network {
port "http" {}
# or maps to container's default port
# port "http" {
# to = 5678
# }
#
}
service {
name = "http-echo"
port = "http"
tags = [
"http-echo",
]
check {
type = "http"
path = "/health"
interval = "30s"
timeout = "2s"
}
}
task "task" {
driver = "docker"
config {
image = "hashicorp/http-echo:latest"
args = [
"-listen", ":${NOMAD_PORT_http}",
"-text", "Hello and welcome to ${NOMAD_IP_http} running on port ${NOMAD_PORT_http}",
]
ports = ["http"]
}
resources {}
}
}
}
Then run docker ps; you will see the mapped port, and curl works.
I need advice on how to set up authentication for Hashi-UI, which I use to manage Nomad and Consul. I have a Debian 8 server where I installed Terraform and created a Terraform file that downloads and runs Nomad and Consul. That works, but when I access Hashi-UI there is no login, so everyone can reach it. I run Hashi-UI as a Nomad job, behind Nginx. How can I set up a user login, like the one I would configure for Apache?
My Nomad file:
job "hashi-ui" {
region = "global"
datacenters = ["dc1"]
type = "service"
update {
stagger = "30s"
max_parallel = 2
}
group "server" {
count = 1
task "hashi-ui" {
driver = "docker"
config {
image = "jippi/hashi-ui"
network_mode = "host"
}
service {
port = "http"
check {
type = "http"
path = "/"
interval = "10s"
timeout = "2s"
}
}
env {
NOMAD_ENABLE = 1
NOMAD_ADDR = "http://0.0.0.0:4646"
CONSUL_ENABLE = 1
CONSUL_ADDR = "http://0.0.0.0:8500"
}
resources {
cpu = 500
memory = 512
network {
mbits = 5
port "http" {
static = 3000
}
}
}
}
task "nginx" {
driver = "docker"
config {
image = "ygersie/nginx-ldap-lua:1.11.3"
network_mode = "host"
volumes = [
"local/config/nginx.conf:/etc/nginx/nginx.conf"
]
}
template {
data = <<EOF
worker_processes 2;
events {
worker_connections 1024;
}
env NS_IP;
env NS_PORT;
http {
access_log /dev/stdout;
error_log /dev/stderr;
auth_ldap_cache_enabled on;
auth_ldap_cache_expiration_time 300000;
auth_ldap_cache_size 10000;
ldap_server ldap_server1 {
url ldaps://ldap.example.com/ou=People,dc=example,dc=com?uid?sub?(objectClass=inetOrgPerson);
group_attribute_is_dn on;
group_attribute member;
satisfy any;
require group "cn=secure-group,ou=Group,dc=example,dc=com";
}
map $http_upgrade $connection_upgrade {
default upgrade;
'' close;
}
server {
listen 15080;
location / {
auth_ldap "Login";
auth_ldap_servers ldap_server1;
set $target '';
set $service "hashi-ui.service.consul";
set_by_lua_block $ns_ip { return os.getenv("NS_IP") or "127.0.0.1" }
set_by_lua_block $ns_port { return os.getenv("NS_PORT") or 53 }
access_by_lua_file /etc/nginx/srv_router.lua;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $connection_upgrade;
proxy_read_timeout 31d;
proxy_pass http://$target;
}
}
}
EOF
destination = "local/config/nginx.conf"
change_mode = "noop"
}
service {
port = "http"
tags = [
"urlprefix-hashi-ui.example.com/"
]
check {
type = "tcp"
interval = "5s"
timeout = "2s"
}
}
resources {
cpu = 100
memory = 64
network {
mbits = 1
port "http" {
static = "15080"
}
}
}
}
}
}
Thank you for any advice.
Since you are using Nginx, you can easily enable authentication in Nginx. Here are some useful links, followed by a minimal basic-auth sketch:
Basic Auth using Nginx: http://nginx.org/en/docs/http/ngx_http_auth_basic_module.html
LDAP Auth using Nginx: http://www.allgoodbits.org/articles/view/29
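For the basic-auth route, a minimal sketch of how it could be wired into the existing nginx task (the htpasswd entry and file paths are placeholders, not values from the original job):
# Ship an htpasswd file next to the existing nginx.conf template.
template {
  data        = "admin:<htpasswd-hash-goes-here>"
  destination = "local/config/htpasswd"
  change_mode = "noop"
}

# Mount it in the nginx task's config, alongside the nginx.conf volume:
#   volumes = [
#     "local/config/nginx.conf:/etc/nginx/nginx.conf",
#     "local/config/htpasswd:/etc/nginx/htpasswd"
#   ]
#
# Then, in the rendered nginx.conf, protect the location block with
# ngx_http_auth_basic_module instead of (or in addition to) LDAP:
#   location / {
#     auth_basic           "Login";
#     auth_basic_user_file /etc/nginx/htpasswd;
#     ...
#   }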
Interestingly, this problem is discussed in the HashiUI GitHub repo as well. Take a look at this approach too:
https://github.com/jippi/hashi-ui/blob/master/docs/authentication_example.md
Thanks,
Arul
I am a beginner and I'm struggling to find a solution for Terraform and Nomad. I need to run Nomad plus Hashi-UI for web management of Nomad. I'm trying to set up and run the Nomad server via Terraform. Hashi-UI runs as a Nomad job, in Docker, and both the Nomad server and Hashi-UI run well. Now I need to create a Terraform file to automate the initial setup and orchestrate Nomad. My server runs Debian 8.
My terraform file nomad.tf:
# Configure the Nomad provider
provider "nomad" {
address = "http://localhost:4646"
region = "global"
# group = "server"
}
variable "version" {
default = "latest"
}
data "template_file" "job" {
template = "${file("./hashi-ui.nomad")}"
vars {
version = "${var.version}"
}
}
# Register a job
resource "nomad_job" "hashi-ui" {
jobspec = "${data.template_file.job.rendered}"
}
And nomad job hashi-ui.nomad:
job "hashi-ui" {
region = "global"
datacenters = ["dc1"]
type = "service"
group "server" {
count = 1
task "hashi-ui" {
driver = "docker"
config {
image = "jippi/hashi-ui"
network_mode = "host"
}
service {
port = "http"
check {
type = "http"
path = "/"
interval = "10s"
timeout = "2s"
}
}
env {
NOMAD_ENABLE = 1
NOMAD_ADDR = "http://0.0.0.0:4646"
}
resources {
cpu = 500
memory = 512
network {
mbits = 5
port "http" {
static = 3000
}
}
}
}
}
}
Terraform plan shows changes, but terraform apply throws this error:
Error applying plan:
1 error(s) occurred:
nomad_job.hashi-ui: 1 error(s) occurred:
nomad_job.hashi-ui: error applying jobspec: Put http://localhost:4646/v1/jobs?region=global: dial tcp [::1]:4646: getsockopt: connection refused
Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.
If I run the Nomad server alongside, then the error is:
1 error(s) occurred:
nomad_job.hashi-ui: 1 error(s) occurred:
nomad_job.hashi-ui: error applying jobspec: Unexpected response code: 500 (1 error(s) occurred:
Task group server validation failed: 1 error(s) occurred:
2 error(s) occurred:
Max parallel can not be less than one: 0 < 1
Stagger must be greater than zero: 0s)
Can you help me please?
You're missing max_parallel and stagger in your Nomad job spec:
job "hashi-ui" {
region = "global"
datacenters = ["dc1"]
type = "service"
update {
stagger = "30s"
max_parallel = 2
}
group "server" {
count = 1
task "hashi-ui" {
driver = "docker"
config {
image = "jippi/hashi-ui"
network_mode = "host"
}
...
My problem
I use Nomad to schedule and deploy Docker images across several nodes. I am using a pretty stable image, so I want that image to be loaded locally rather than fetched from Docker Hub each time.
The docker.cleanup.image argument should do just that:
docker.cleanup.image Defaults to true. Changing this to false will prevent Nomad from removing images from stopped tasks, which is exactly what I want.
The documentation example is:
client {
options {
"docker.cleanup.image" = "false"
}
}
However, I don't know where this stanza goes. I tried placing it in the job or task sections of the fairly simple configuration file, with no success.
Code (configuration file)
job "example" {
datacenters = ["dc1"]
type = "service"
update {
max_parallel = 30
min_healthy_time = "10s"
healthy_deadline = "3m"
auto_revert = false
canary = 0
}
group "cache" {
count = 30
restart {
attempts = 10
interval = "5m"
delay = "25s"
mode = "delay"
}
ephemeral_disk {
size = 300
}
task "redis" {
driver = "docker"
config {
image = "whatever/whatever:v1"
port_map {
db = 80
}
}
env {
"LOGGER" = "ec2-52-58-216-66.eu-central-1.compute.amazonaws.com"
}
resources {
network {
mbits = 10
port "db" {}
}
}
service {
name = "global-redis-check"
tags = ["global", "cache"]
port = "db"
}
}
}
}
My question
Where do I place the client stanza in the nomad configuration file?
This doesn't go in your job file; it goes in the agent configuration on the Nomad clients (the nodes where Nomad jobs are deployed).
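For example, a sketch of a client agent configuration file (the /etc/nomad.d/client.hcl path is only an illustration; use whatever file your agent loads via -config, and restart the agent afterwards):
# /etc/nomad.d/client.hcl -- example path, loaded with: nomad agent -config=/etc/nomad.d
client {
  enabled = true

  options = {
    # Keep images from stopped tasks cached on the node instead of removing
    # them, so redeploys of the same stable image skip the registry pull.
    "docker.cleanup.image" = "false"
  }
}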