Cert-Manager Challenge pending, no error in MIC (Azure DNS on AKS)

I can't get TLS to work. The CertificateRequest gets created, and so do the Order and the Challenge. However, the Challenge is stuck in pending:
Name: test-tls-secret-8qshd-3608253913-1269058669
Namespace: test
Labels: <none>
Annotations: <none>
API Version: acme.cert-manager.io/v1
Kind: Challenge
Metadata:
Creation Timestamp: 2022-07-19T08:17:04Z
Finalizers:
finalizer.acme.cert-manager.io
Generation: 1
Managed Fields:
API Version: acme.cert-manager.io/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:finalizers:
.:
v:"finalizer.acme.cert-manager.io":
Manager: cert-manager-challenges
Operation: Update
Time: 2022-07-19T08:17:04Z
API Version: acme.cert-manager.io/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:ownerReferences:
.:
k:{"uid":"06029d3f-d1ce-45db-a267-796ff9b82a67"}:
f:spec:
.:
f:authorizationURL:
f:dnsName:
f:issuerRef:
.:
f:group:
f:kind:
f:name:
f:key:
f:solver:
.:
f:dns01:
.:
f:azureDNS:
.:
f:environment:
f:hostedZoneName:
f:resourceGroupName:
f:subscriptionID:
f:token:
f:type:
f:url:
f:wildcard:
Manager: cert-manager-orders
Operation: Update
Time: 2022-07-19T08:17:04Z
API Version: acme.cert-manager.io/v1
Fields Type: FieldsV1
fieldsV1:
f:status:
.:
f:presented:
f:processing:
f:reason:
f:state:
Manager: cert-manager-challenges
Operation: Update
Subresource: status
Time: 2022-07-19T08:25:38Z
Owner References:
API Version: acme.cert-manager.io/v1
Block Owner Deletion: true
Controller: true
Kind: Order
Name: test-tls-secret-8qshd-3608253913
UID: 06029d3f-d1ce-45db-a267-796ff9b82a67
Resource Version: 4528159
UID: 9594ed48-72c6-4403-8356-4991950fe9bb
Spec:
Authorization URL: https://acme-v02.api.letsencrypt.org/acme/authz-v3/131873811576
Dns Name: test.internal.<company_id>.com
Issuer Ref:
Group: cert-manager.io
Kind: ClusterIssuer
Name: letsencrypt
Key: xrnhZETWbkGTE7CA0A3CQd6a48d4JG4HKDiCXPpxTWM
Solver:
dns01:
Azure DNS:
Environment: AzurePublicCloud
Hosted Zone Name: internal.<company_id>.com
Resource Group Name: tool-cluster-rg
Subscription ID: <subscription_id>
Token: jXCR2UorNanlHqZd8T7Ifjbx6PuGfLBwnzWzBnDvCyc
Type: DNS-01
URL: https://acme-v02.api.letsencrypt.org/acme/chall-v3/131873811576/vCGdog
Wildcard: false
Status:
Presented: false
Processing: true
Reason: azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/<subscription_id>/resourceGroups/tool-cluster-rg/providers/Microsoft.Network/dnsZones/internal.<company_id>.com/TXT/_acme-challenge.test?api-version=2017-10-01: StatusCode=404 -- Original Error: adal: Refresh request failed. Status Code = '404'. Response body: getting assigned identities for pod cert-manager/cert-manager-5bb7949947-qlg5j in CREATED state failed after 16 attempts, retry duration [5]s, error: <nil>. Check MIC pod logs for identity assignment errors
Endpoint http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.core.windows.net%2F
State: pending
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Started 59m cert-manager-challenges Challenge scheduled for processing
Warning PresentError 11s (x7 over 51m) cert-manager-challenges Error presenting challenge: azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/<subscription_id>/resourceGroups/tool-cluster-rg/providers/Microsoft.Network/dnsZones/internal.<company_id>.com/TXT/_acme-challenge.test?api-version=2017-10-01: StatusCode=404 -- Original Error: adal: Refresh request failed. Status Code = '404'. Response body: getting assigned identities for pod cert-manager/cert-manager-5bb7949947-qlg5j in CREATED state failed after 16 attempts, retry duration [5]s, error: <nil>. Check MIC pod logs for identity assignment errors
Endpoint http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.core.windows.net%2F
It says to check the MIC pod logs; however, there are no errors logged:
I0719 08:16:52.271516 1 mic.go:587] pod test/test-deployment-b5dcc75f4-5gdtj has no assigned node yet. it will be ignored
I0719 08:16:52.284362 1 mic.go:608] No AzureIdentityBinding found for pod test/test-deployment-b5dcc75f4-5gdtj that matches selector: certman-label. it will be ignored
I0719 08:16:53.735678 1 mic.go:648] certman-identity identity not found when using test/certman-id-binding binding
I0719 08:16:53.737027 1 mic.go:1040] processing node aks-default-10282586-vmss, add [1], del [0], update [0]
I0719 08:16:53.737061 1 crd.go:514] creating assigned id test/test-deployment-b5dcc75f4-5gdtj-test-certman-identity
I0719 08:16:53.844892 1 cloudprovider.go:210] updating user-assigned identities on aks-default-10282586-vmss, assign [1], unassign [0]
I0719 08:17:04.545556 1 crd.go:777] updating AzureAssignedIdentity test/test-deployment-b5dcc75f4-5gdtj-test-certman-identity status to Assigned
I0719 08:17:04.564464 1 mic.go:525] work done: true. Found 1 pods, 1 ids, 1 bindings
I0719 08:17:04.564477 1 mic.go:526] total work cycles: 392, out of which work was done in: 320
I0719 08:17:04.564492 1 stats.go:183] ** stats collected **
I0719 08:17:04.564497 1 stats.go:162] Pod listing: 20.95µs
I0719 08:17:04.564504 1 stats.go:162] AzureIdentity listing: 2.357µs
I0719 08:17:04.564508 1 stats.go:162] AzureIdentityBinding listing: 3.211µs
I0719 08:17:04.564512 1 stats.go:162] AzureAssignedIdentity listing: 431ns
I0719 08:17:04.564516 1 stats.go:162] System: 71.101µs
I0719 08:17:04.564520 1 stats.go:162] CacheSync: 4.482µs
I0719 08:17:04.564523 1 stats.go:162] Cloud provider GET: 83.123547ms
I0719 08:17:04.564527 1 stats.go:162] Cloud provider PATCH: 10.700611864s
I0719 08:17:04.564531 1 stats.go:162] AzureAssignedIdentity creation: 24.654916ms
I0719 08:17:04.564535 1 stats.go:162] AzureAssignedIdentity update: 0s
I0719 08:17:04.564538 1 stats.go:162] AzureAssignedIdentity deletion: 0s
I0719 08:17:04.564542 1 stats.go:170] Number of cloud provider PATCH: 1
I0719 08:17:04.564546 1 stats.go:170] Number of cloud provider GET: 1
I0719 08:17:04.564549 1 stats.go:170] Number of AzureAssignedIdentities created in this sync cycle: 1
I0719 08:17:04.564554 1 stats.go:170] Number of AzureAssignedIdentities updated in this sync cycle: 0
I0719 08:17:04.564557 1 stats.go:170] Number of AzureAssignedIdentities deleted in this sync cycle: 0
I0719 08:17:04.564561 1 stats.go:162] Find AzureAssignedIdentities to create: 0s
I0719 08:17:04.564564 1 stats.go:162] Find AzureAssignedIdentities to delete: 0s
I0719 08:17:04.564568 1 stats.go:162] Total time to assign or update AzureAssignedIdentities: 10.827425179s
I0719 08:17:04.564573 1 stats.go:162] Total: 10.82763016s
I0719 08:17:04.564577 1 stats.go:212] *********************
I0719 08:19:34.077484 1 mic.go:1466] reconciling identity assignment for [/subscriptions/<subscription_id>/resourceGroups/tool-cluster-rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/cert-manager-dns01] on node aks-default-10282586-vmss
I0719 08:22:34.161195 1 mic.go:1466] reconciling identity assignment for [/subscriptions/<subscription_id>/resourceGroups/tool-cluster-rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/cert-manager-dns01] on node aks-default-10282586-vmss
The "reconciling identity" output gets repeated afterwards. Up to this point, I was able to handle my way through error messages, but now I have no idea how to proceed. Anyone got any lead what I'm missing?
Following my terraform code for the infrastructure.
terraform {
cloud {
organization = "<company_id>"
workspaces {
name = "tool-cluster"
}
}
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = ">= 3.6.0, < 4.0.0"
}
}
}
provider "azurerm" {
features {}
}
data "azurerm_client_config" "default" {}
variable "id" {
type = string
description = "Company wide unique terraform identifier"
default = "tool-cluster"
}
resource "azurerm_resource_group" "default" {
name = "${var.id}-rg"
location = "westeurope"
}
resource "azurerm_kubernetes_cluster" "default" {
name = "${var.id}-aks"
location = azurerm_resource_group.default.location
resource_group_name = azurerm_resource_group.default.name
dns_prefix = var.id
default_node_pool {
name = "default"
node_count = 1
vm_size = "Standard_D4_v5"
}
identity {
type = "SystemAssigned"
}
role_based_access_control_enabled = true
http_application_routing_enabled = true
}
resource "azurerm_dns_zone" "internal" {
name = "internal.<company_id>.com"
resource_group_name = azurerm_resource_group.default.name
}
resource "azurerm_user_assigned_identity" "dns_identity" {
name = "cert-manager-dns01"
resource_group_name = azurerm_resource_group.default.name
location = azurerm_resource_group.default.location
}
resource "azurerm_role_assignment" "dns_contributor" {
scope = azurerm_dns_zone.internal.id
role_definition_name = "DNS Zone Contributor"
principal_id = azurerm_user_assigned_identity.dns_identity.principal_id
}
I've assigned the "Managed Identity Operator" and "Virtual Machine Contributor" roles to the kubelet_identity at the scope of the cluster's generated resource group (MC_tool-cluster-rg_tool-cluster-aks_westeurope), and the "Managed Identity Operator" role at the scope of the cluster's own resource group (tool-cluster-rg).
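For reference, a minimal Terraform sketch of those role assignments, assuming they are expressed against the azurerm_kubernetes_cluster resource above (the resource names here are illustrative, not the exact ones I used):
data "azurerm_resource_group" "aks_nodes" {
  # The generated node resource group, e.g. MC_tool-cluster-rg_tool-cluster-aks_westeurope
  name = azurerm_kubernetes_cluster.default.node_resource_group
}
resource "azurerm_role_assignment" "kubelet_mi_operator_node_rg" {
  scope                = data.azurerm_resource_group.aks_nodes.id
  role_definition_name = "Managed Identity Operator"
  principal_id         = azurerm_kubernetes_cluster.default.kubelet_identity[0].object_id
}
resource "azurerm_role_assignment" "kubelet_vm_contributor_node_rg" {
  scope                = data.azurerm_resource_group.aks_nodes.id
  role_definition_name = "Virtual Machine Contributor"
  principal_id         = azurerm_kubernetes_cluster.default.kubelet_identity[0].object_id
}
resource "azurerm_role_assignment" "kubelet_mi_operator_cluster_rg" {
  scope                = azurerm_resource_group.default.id
  role_definition_name = "Managed Identity Operator"
  principal_id         = azurerm_kubernetes_cluster.default.kubelet_identity[0].object_id
}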
The code for cert-manager:
terraform {
cloud {
organization = "<company_id>"
workspaces {
name = "cert-manager"
}
}
required_providers {
kubernetes = {
source = "hashicorp/kubernetes"
version = ">= 2.12.0, < 3.0.0"
}
helm = {
source = "hashicorp/helm"
version = ">= 2.6.0, < 3.0.0"
}
azurerm = {
source = "hashicorp/azurerm"
version = ">= 3.6.0, < 4.0.0"
}
}
}
data "terraform_remote_state" "tool-cluster" {
backend = "remote"
config = {
organization = "<company_id>"
workspaces = {
name = "tool-cluster"
}
}
}
provider "azurerm" {
features {}
}
provider "kubernetes" {
host = data.terraform_remote_state.tool-cluster.outputs.host
client_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_certificate)
client_key = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_key)
cluster_ca_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.cluster_ca_certificate)
}
provider "helm" {
kubernetes {
host = data.terraform_remote_state.tool-cluster.outputs.host
client_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_certificate)
client_key = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_key)
cluster_ca_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.cluster_ca_certificate)
}
}
locals {
app-name = "cert-manager"
}
resource "kubernetes_namespace" "cert_manager" {
metadata {
name = local.app-name
}
}
resource "helm_release" "cert_manager" {
name = local.app-name
repository = "https://charts.jetstack.io"
chart = "cert-manager"
version = "v1.8.2"
namespace = kubernetes_namespace.cert_manager.metadata.0.name
set {
name = "installCRDs"
value = "true"
}
}
resource "helm_release" "aad_pod_identity" {
name = "aad-pod-identity"
repository = "https://raw.githubusercontent.com/Azure/aad-pod-identity/master/charts"
chart = "aad-pod-identity"
version = "v4.1.10"
namespace = kubernetes_namespace.cert_manager.metadata.0.name
}
resource "azurerm_user_assigned_identity" "default" {
name = local.app-name
resource_group_name = data.terraform_remote_state.tool-cluster.outputs.resource_name
location = data.terraform_remote_state.tool-cluster.outputs.resource_location
}
resource "azurerm_role_assignment" "default" {
scope = data.terraform_remote_state.tool-cluster.outputs.dns_zone_id
role_definition_name = "DNS Zone Contributor"
principal_id = azurerm_user_assigned_identity.default.principal_id
}
output "namespace" {
value = kubernetes_namespace.cert_manager.metadata.0.name
sensitive = false
}
and the code for my issuer:
terraform {
cloud {
organization = "<company_id>"
workspaces {
name = "cert-issuer"
}
}
required_providers {
kubernetes = {
source = "hashicorp/kubernetes"
version = ">= 2.12.0, < 3.0.0"
}
helm = {
source = "hashicorp/helm"
version = ">= 2.6.0, < 3.0.0"
}
azurerm = {
source = "hashicorp/azurerm"
version = ">= 3.6.0, < 4.0.0"
}
}
}
data "terraform_remote_state" "tool-cluster" {
backend = "remote"
config = {
organization = "<company_id>"
workspaces = {
name = "tool-cluster"
}
}
}
data "terraform_remote_state" "cert-manager" {
backend = "remote"
config = {
organization = "<company_id>"
workspaces = {
name = "cert-manager"
}
}
}
provider "azurerm" {
features {}
}
provider "kubernetes" {
host = data.terraform_remote_state.tool-cluster.outputs.host
client_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_certificate)
client_key = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_key)
cluster_ca_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.cluster_ca_certificate)
}
provider "helm" {
kubernetes {
host = data.terraform_remote_state.tool-cluster.outputs.host
client_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_certificate)
client_key = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_key)
cluster_ca_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.cluster_ca_certificate)
}
}
locals {
app-name = "cert-manager"
}
data "azurerm_subscription" "current" {}
resource "kubernetes_manifest" "cluster_issuer" {
manifest = yamldecode(templatefile(
"${path.module}/cluster-issuer.tpl.yaml",
{
"name" = "letsencrypt"
"subscription_id" = data.azurerm_subscription.current.subscription_id
"resource_group_name" = data.terraform_remote_state.tool-cluster.outputs.resource_name
"dns_zone_name" = data.terraform_remote_state.tool-cluster.outputs.dns_zone_name
}
))
}
And the YAML template for it:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: ${name}
spec:
  acme:
    email: support@<company_id>.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: ${name}
    solvers:
    - dns01:
        azureDNS:
          resourceGroupName: ${resource_group_name}
          subscriptionID: ${subscription_id}
          hostedZoneName: ${dns_zone_name}
          environment: AzurePublicCloud
Finally, my sample app:
terraform {
cloud {
organization = "<company_id>"
workspaces {
name = "test-web-app"
}
}
required_providers {
kubernetes = {
source = "hashicorp/kubernetes"
version = ">= 2.12.0, < 3.0.0"
}
azurerm = {
source = "hashicorp/azurerm"
version = ">= 3.6.0, < 4.0.0"
}
azuread = {
source = "hashicorp/azuread"
version = ">= 2.26.0, < 3.0.0"
}
}
}
data "terraform_remote_state" "tool-cluster" {
backend = "remote"
config = {
organization = "<company_id>"
workspaces = {
name = "tool-cluster"
}
}
}
provider "azuread" {}
provider "azurerm" {
features {}
}
provider "kubernetes" {
host = data.terraform_remote_state.tool-cluster.outputs.host
client_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_certificate)
client_key = base64decode(data.terraform_remote_state.tool-cluster.outputs.client_key)
cluster_ca_certificate = base64decode(data.terraform_remote_state.tool-cluster.outputs.cluster_ca_certificate)
}
locals {
app-name = "test"
host = "test.${data.terraform_remote_state.tool-cluster.outputs.cluster_domain_name}"
}
resource "azurerm_dns_cname_record" "default" {
name = local.app-name
zone_name = data.terraform_remote_state.tool-cluster.outputs.dns_zone_name
resource_group_name = data.terraform_remote_state.tool-cluster.outputs.resource_name
ttl = 300
record = local.host
}
resource "azuread_application" "default" {
display_name = local.app-name
}
resource "kubernetes_namespace" "default" {
metadata {
name = local.app-name
}
}
resource "kubernetes_secret" "auth" {
metadata {
name = "basic-auth"
namespace = kubernetes_namespace.default.metadata.0.name
}
data = {
"auth" = file("./auth")
}
}
resource "kubernetes_deployment" "default" {
metadata {
name = "${local.app-name}-deployment"
namespace = kubernetes_namespace.default.metadata.0.name
labels = {
app = local.app-name
}
}
spec {
replicas = 1
selector {
match_labels = {
app = local.app-name
}
}
template {
metadata {
labels = {
app = local.app-name
aadpodidbinding = "certman-label"
}
}
spec {
container {
image = "crccheck/hello-world:latest"
name = local.app-name
port {
container_port = 8000
host_port = 8000
}
}
}
}
}
}
resource "kubernetes_service" "default" {
metadata {
name = "${local.app-name}-svc"
namespace = kubernetes_namespace.default.metadata.0.name
}
spec {
selector = {
app = kubernetes_deployment.default.metadata.0.labels.app
}
port {
port = 8000
target_port = 8000
}
}
}
resource "kubernetes_ingress_v1" "default" {
metadata {
name = "${local.app-name}-ing"
namespace = kubernetes_namespace.default.metadata.0.name
annotations = {
"kubernetes.io/ingress.class" = "addon-http-application-routing"
"cert-manager.io/cluster-issuer" = "letsencrypt"
# basic-auth
"nginx.ingress.kubernetes.io/auth-type" = "basic"
"nginx.ingress.kubernetes.io/auth-secret" = "basic-auth"
"nginx.ingress.kubernetes.io/auth-realm" = "Authentication Required - foo"
}
}
spec {
rule {
host = local.host
http {
path {
path = "/"
backend {
service {
name = kubernetes_service.default.metadata.0.name
port {
number = 8000
}
}
}
}
}
}
rule {
host = trimsuffix(azurerm_dns_cname_record.default.fqdn, ".")
http {
path {
path = "/"
backend {
service {
name = kubernetes_service.default.metadata.0.name
port {
number = 8000
}
}
}
}
}
}
tls {
hosts = [ trimsuffix(azurerm_dns_cname_record.default.fqdn, ".") ]
secret_name = "${local.app-name}-tls-secret"
}
}
}
resource "kubernetes_manifest" "azure_identity" {
manifest = yamldecode(templatefile(
"${path.module}/azure_identity.tpl.yaml",
{
"namespace" = kubernetes_namespace.default.metadata.0.name
"resource_id" = data.terraform_remote_state.tool-cluster.outputs.identity_resource_id
"client_id" = data.terraform_remote_state.tool-cluster.outputs.identity_client_id
}
))
}
resource "kubernetes_manifest" "azure_identity_binding" {
manifest = yamldecode(templatefile(
"${path.module}/azure_identity_binding.tpl.yaml",
{
"namespace" = kubernetes_namespace.default.metadata.0.name
"resource_id" = data.terraform_remote_state.tool-cluster.outputs.identity_resource_id
"client_id" = data.terraform_remote_state.tool-cluster.outputs.identity_client_id
}
))
}
The two identity YAML templates:
apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentity
metadata:
annotations:
# recommended to use namespaced identites https://azure.github.io/aad-pod-identity/docs/configure/match_pods_in_namespace/
aadpodidentity.k8s.io/Behavior: namespaced
name: certman-identity
namespace: ${namespace} # change to your preferred namespace
spec:
type: 0 # MSI
resourceID: ${resource_id} # Resource Id From Previous step
clientID: ${client_id} # Client Id from previous step
and
apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentityBinding
metadata:
name: certman-id-binding
namespace: ${namespace} # change to your preferred namespace
spec:
azureIdentity: certman-identity
selector: certman-label # This is the label that needs to be set on cert-manager pods
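As that last comment says, the selector has to match an aadpodidbinding label on the cert-manager pods themselves. As a point of reference, a minimal sketch of setting that label through the existing helm_release, assuming the chart exposes a podLabels value:
resource "helm_release" "cert_manager" {
  name       = local.app-name
  repository = "https://charts.jetstack.io"
  chart      = "cert-manager"
  version    = "v1.8.2"
  namespace  = kubernetes_namespace.cert_manager.metadata.0.name
  set {
    name  = "installCRDs"
    value = "true"
  }
  # Assumed chart value: labels the cert-manager pods so that the
  # AzureIdentityBinding selector ("certman-label") can match them.
  set {
    name  = "podLabels.aadpodidbinding"
    value = "certman-label"
  }
}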
I was not able to solve it with HTTP application routing, so I installed my own ingress controller and, instead of aad-pod-identity, installed ExternalDNS with a Service Principal. The Terraform code for that:
locals {
app-name = "external-dns"
}
resource "azuread_application" "dns" {
display_name = "dns-service_principal"
}
resource "azuread_application_password" "dns" {
application_object_id = azuread_application.dns.object_id
}
resource "azuread_service_principal" "dns" {
application_id = azuread_application.dns.application_id
description = "Service Principal to write DNS changes for ${data.terraform_remote_state.tool-cluster.outputs.dns_zone_name}"
}
resource "azurerm_role_assignment" "dns_zone_contributor" {
scope = data.terraform_remote_state.tool-cluster.outputs.dns_zone_id
role_definition_name = "DNS Zone Contributor"
principal_id = azuread_service_principal.dns.id
}
resource "azurerm_role_assignment" "rg_reader" {
scope = data.terraform_remote_state.tool-cluster.outputs.dns_zone_id
role_definition_name = "Reader"
principal_id = azuread_service_principal.dns.id
}
resource "kubernetes_secret" "external_dns_secret" {
metadata {
name = "azure-config-file"
}
data = { "azure.json" = jsonencode({
tenantId = data.azurerm_subscription.default.tenant_id
subscriptionId = data.azurerm_subscription.default.subscription_id
resourceGroup = data.terraform_remote_state.tool-cluster.outputs.resource_name
aadClientId = azuread_application.dns.application_id
aadClientSecret = azuread_application_password.dns.value
})
}
}
resource "kubernetes_service_account" "dns" {
metadata {
name = local.app-name
}
}
resource "kubernetes_cluster_role" "dns" {
metadata {
name = local.app-name
}
rule {
api_groups = [ "" ]
resources = [ "services","endpoints","pods", "nodes" ]
verbs = [ "get","watch","list" ]
}
rule {
api_groups = [ "extensions","networking.k8s.io" ]
resources = [ "ingresses" ]
verbs = [ "get","watch","list" ]
}
}
resource "kubernetes_cluster_role_binding" "dns" {
metadata {
name = "${local.app-name}-viewer"
}
role_ref {
api_group = "rbac.authorization.k8s.io"
kind = "ClusterRole"
name = kubernetes_cluster_role.dns.metadata.0.name
}
subject {
kind = "ServiceAccount"
name = kubernetes_service_account.dns.metadata.0.name
}
}
resource "kubernetes_deployment" "dns" {
metadata {
name = local.app-name
}
spec {
strategy {
type = "Recreate"
}
selector {
match_labels = {
"app" = local.app-name
}
}
template {
metadata {
labels = {
"app" = local.app-name
}
}
spec {
service_account_name = kubernetes_service_account.dns.metadata.0.name
container {
name = local.app-name
image = "bitnami/external-dns:0.12.1"
args = [ "--source=service", "--source=ingress", "--provider=azure", "--txt-prefix=externaldns-" ]
volume_mount {
name = kubernetes_secret.external_dns_secret.metadata.0.name
mount_path = "/etc/kubernetes"
read_only = true
}
}
volume {
name = kubernetes_secret.external_dns_secret.metadata.0.name
secret {
secret_name = kubernetes_secret.external_dns_secret.metadata.0.name
}
}
}
}
}
}
