Why is the Network Watcher on Azure not destroyed by Terraform? - terraform-provider-azure

I have a simple Terraform configuration to create an Azure virtual network. When I run plan and then apply, a virtual network is created inside a resource group, as expected. But in addition to this resource group, one more is created with the name NetworkWatcherRG, and inside it I see a Network Watcher.
Now when I run terraform destroy, I expect everything to be cleaned up and all the resource groups destroyed. Instead, everything is destroyed except the NetworkWatcherRG and the Network Watcher inside it.
It looks like the Network Watcher, along with its resource group, is NOT managed by Terraform. What am I missing?
The Network Watcher is not immediately obvious; it is not revealed right away. To see it, you need to go to the simplified view of the resource groups and click the Refresh button several times (at least 5 for me, a couple of seconds apart), or wait a long time and then refresh.
So what is this Network Watcher? Is Azure creating it by itself, outside of Terraform's management?
My Terraform configuration file is as follows.
# Terraform settings Block
terraform {
  required_version = ">= 1.0.0"
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 2.0"
    }
  }
}

# Provider Block
provider "azurerm" {
  features {}
}

# Create virtual network
resource "azurerm_virtual_network" "myvnet" {
  name                = "vivek-1-vnet"
  address_space       = ["10.0.0.0/16"] # This is a list (note the []); with { } it would be a map.
  location            = azurerm_resource_group.myrg.location
  resource_group_name = azurerm_resource_group.myrg.name
  tags = { # This is a map (note the {}).
    "name" = "vivek-1-vnet"
  }
}

# Resource-1: Azure Resource Group
resource "azurerm_resource_group" "myrg" {
  name     = "vivek-vnet-rg"
  location = var.resource_group_location
}

variable "resource_group_location" {
  default     = "centralindia"
  description = "Location of the resource group."
}
And finally the commands I use are as follows.
terraform fmt
terraform init
terraform validate
terraform plan -out main.tfplan
terraform apply main.tfplan
terraform plan -destroy -out main.destroy.tfplan
terraform apply main.destroy.tfplan

I read the response from #RahulKumarShaw-MT. I believe the answer, and it makes complete sense that Terraform won't destroy resources it didn't create (unless someone can demonstrate otherwise).
That said, I was able to delete the NetworkWatcherRG group using Terraform. To achieve this, I added a network watcher as one of my declared resources using azurerm_network_watcher (see https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/network_watcher) in the same Terraform script where I requested a virtual machine in another, separate resource group. I think you created a vnet; my script creates a vnet too, which is perhaps why Azure concludes there is a need for a network watcher. I named the resource group containing my network watcher whatever I wanted; it doesn't have to be 'NetworkWatcherRG'. I watched the resource group be created and destroyed successfully with Terraform (using terraform apply and terraform destroy, respectively) along with my VM and vnet resources. At the end, I refreshed the Azure Portal web page and saw no resource groups or resources left in my test subscription.
I'm not an Azure expert, but I suspect that if Azure already sees a network watcher present, it won't create an additional one when Terraform creates resources that need it (in my case a VM and a vnet), as long as Terraform creates the watcher before Azure gets the chance to.
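For reference, declaring the watcher yourself looks roughly like this (a minimal sketch; the group and watcher names are illustrative, not taken from the original script):

resource "azurerm_resource_group" "watcher_rg" {
  name     = "my-watcher-rg" # any name works; it doesn't have to be NetworkWatcherRG
  location = "centralindia"
}

resource "azurerm_network_watcher" "watcher" {
  name                = "my-network-watcher"
  location            = azurerm_resource_group.watcher_rg.location
  resource_group_name = azurerm_resource_group.watcher_rg.name
}

Because both resources are declared in the configuration, they end up in the .tfstate file, and terraform destroy removes them like any other managed resource.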

Before applying the Terraform code, I checked my resource groups, and the Network Watcher resource group was already there for me; this resource group is created by default on the Azure side.
As Mike-Ubezzi wrote on Microsoft forums:
Network Watcher resources are located in the hidden NetworkWatcherRG resource group which is created automatically. For example, the NSG Flow Logs resource is a child resource of Network Watcher and is enabled in the NetworkWatcherRG.
The Network Watcher resource represents the backend service for Network Watcher and is fully managed by Azure. Customers do not need to manage it. Operations like move are not supported on the resource. However, the resource can be deleted.
So terraform destroy will only delete the resources created by you (the ones recorded in the .tfstate file). This is the reason you won't be able to delete the NetworkWatcherRG resource group with it.
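If you do want to get rid of the auto-created group anyway, it has to happen outside Terraform. A minimal sketch with the Azure CLI (assuming nothing you still need lives in that group):

az group delete --name NetworkWatcherRG --yes

Note that Azure may recreate the Network Watcher the next time a virtual network is created in that region.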

Related

Domain Mapping to point to a tag inside a service on Cloud Run

Right now I'm deploying to Cloud Run by running
gcloud run deploy myapp --tag pr123 --no-traffic
I can then access the app via
https://pr123---myapp-jo5dg6hkf-ez.a.run.app
Now I would like to have a custom domain mapping going to this tag. I know how to point a custom domain to the service, but I don't know how to point it to the tagged version of my service.
Can I add labels to the DomainMapping that would cause the mapping to go to this version of my Cloud Run service? Or is there a routeName, e.g. myapp#pr123, that would do the trick there?
In the end I would like to have
https://pr123.dev.mydomain.com
be the endpoint for this service.
With a custom domain, you configure DNS to point to a service, not to a revision/tag of the service, so you can't do it this way.
The solution is to use a load balancer with a serverless NEG. The most important part is to define the URL mask that maps the tag and service from the URL received by the load balancer.
I ended up building the load balancer with a network endpoint group (as suggested). For further reference, here is my Terraform snippet to create it. The <tag> part is then the traffic tag you assign to your revision.
resource "google_compute_region_network_endpoint_group" "api_neg" {
name = "api-neg"
network_endpoint_type = "SERVERLESS"
region = "europe-west3"
cloud_run {
service = data.google_cloud_run_service.api_dev.name
url_mask = "<tag>.preview.mydomain.com"
}
}

Can't use log analytics workspace in a different subscription? terraform azurerm policy assignment

I'm using Terraform to write Azure policy as code. I found two problems:
1. I can't seem to use a Log Analytics workspace that is in a different subscription; within the same subscription, it's fine.
2. For policies that need a managed identity, I can't seem to assign the correct rights to it.
resource "azurerm_policy_assignment" "Enable_Azure_Monitor_for_VMs" {
name = "Enable Azure Monitor for VMs"
scope = data.azurerm_subscription.current.id
policy_definition_id = "/providers/Microsoft.Authorization/policySetDefinitions/55f3eceb-5573-4f18-9695-226972c6d74a"
description = "Enable Azure Monitor for the virtual machines (VMs) in the specified scope (management group, subscription or resource group). Takes Log Analytics workspace as parameter."
display_name = "Enable Azure Monitor for VMs"
location = var.location
metadata = jsonencode(
{
"category" : "General"
})
parameters = jsonencode({
"logAnalytics_1" : {
"value" : var.log_analytics_workspace_ID
}
})
identity {
type = "SystemAssigned"
}
}
resource "azurerm_role_assignment" "vm_policy_msi_assignment" {
scope = azurerm_policy_assignment.Enable_Azure_Monitor_for_VMs.scope
role_definition_name = "Contributor"
principal_id = azurerm_policy_assignment.Enable_Azure_Monitor_for_VMs.identity[0].principal_id
}
For var.log_analytics_workspace_ID, if I use a workspace ID that is in the same subscription as the policy, it works fine. But if I use a workspace ID from a different subscription, after deployment the workspace field will be blank.
Also, for resource "azurerm_role_assignment" "vm_policy_msi_assignment", I have already given myself the User Access Administrator role, but after deployment, "This identity currently has the following permissions:" is still blank?
I got an answer to my own question :)
1. This is not something designed well in Azure, I reckon. MS states: "a Managed Identity (MSI) is created for each policy assignment that contains DeployIfNotExists effects in the definitions. The required permission for the target assignment scope is managed automatically. However, if the remediation tasks need to interact with resources outside of the assignment scope, you will need to manually configure the required permissions."
This means the system-generated managed identity that needs access to a Log Analytics workspace in another subscription must be manually granted Log Analytics Contributor rights on that workspace. And since you can't use a user-assigned managed identity here, you can't pre-populate this.
So if you want to achieve this in Terraform, it seems you have to run the policy assignment twice: the first time just to get the identity's ID, then manually (or via script) assign the permission, and then run the policy assignment again to point to the resource. A sketch of that role assignment is below.
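If you want to keep that manual step in Terraform once the identity exists, a role assignment along these lines could work (a sketch, assuming var.log_analytics_workspace_ID holds the full resource ID of the workspace in the other subscription and your credentials are allowed to assign roles there):

resource "azurerm_role_assignment" "law_cross_sub" {
  # the scope is the workspace in the other subscription
  scope                = var.log_analytics_workspace_ID
  role_definition_name = "Log Analytics Contributor"
  principal_id         = azurerm_policy_assignment.Enable_Azure_Monitor_for_VMs.identity[0].principal_id
}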
2. The identity was actually given the Contributor rights; you just have to go into the subscription's RBAC view to see it.

How to use a Google Secret in a deployed Cloud Run Service (managed)?

I have a running Cloud Run service user-service. For test purposes I passed client secrets via environment variables as plain text. Now that everything is working fine, I'd like to use a secret instead.
In the "Variables" tab of the "Edit Revision" option I can declare environment variables, but I have no idea how to pass in a secret. Do I just need to pass the secret name like ${my-secret-id} in the value field of the variable? There is no documentation on how to use secrets in this tab, only a hint at the top:
Store and consume secrets using Secret Manager
Which is not very helpful in this case.
You can now read secrets from Secret Manager as environment variables in Cloud Run. This means you can audit your secrets, set permissions per secret, version secrets, etc., and your code doesn't have to change.
You can point to the secrets through the Cloud Console GUI (console.cloud.google.com) or make the configuration when you deploy your Cloud Run service from the command-line:
gcloud beta run deploy SERVICE --image IMAGE_URL --update-secrets=ENV_VAR_NAME=SECRET_NAME:VERSION
Six-minute video overview: https://youtu.be/JIE89dneaGo
Detailed docs: https://cloud.google.com/run/docs/configuring/secrets
UPDATE 2021: There is now a Cloud Run preview for loading secrets to an environment variable or a volume. https://cloud.google.com/run/docs/configuring/secrets
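The volume variant uses the same --update-secrets flag with a path as the key; a sketch with placeholder names (the container then reads the secret value from the file /secrets/api-key):

gcloud beta run deploy SERVICE --image IMAGE_URL --update-secrets=/secrets/api-key=SECRET_NAME:VERSION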
The question is now answered; however, I have been experiencing a similar problem using Cloud Run with Java & Quarkus and a native image created using GraalVM.
While Cloud Run is a really interesting technology, at the time of writing it lacks the ability to load secrets through the Cloud Run configuration. This has certainly added complexity to my app when doing local development.
Additionally, Google's documentation is really quite poor. The quick-start lacks a clear Java example for getting a secret[1] without it being set in the same method; I'd expect this to have been the most common use case!
The javadoc itself seems to be largely autogenerated, with protobuf language everywhere. There are various similarly named methods like getSecret, getSecretVersion and accessSecretVersion.
I'd really like to see some improvement from Google around this. I don't think it is asking too much for dedicated teams to make libraries for common languages with proper documentation.
Here is a snippet that I'm using to load this information. It requires the GCP Secret library and also the GCP Cloud Core library for loading the project ID.
public String getSecret(final String secretName) {
    LOGGER.info("Going to load secret {}", secretName);
    // SecretManagerServiceClient should be closed after the request
    try (SecretManagerServiceClient client = buildClient()) {
        // "latest" is an alias to the latest version of a secret
        final SecretVersionName name = SecretVersionName.of(getProjectId(), secretName, "latest");
        return client.accessSecretVersion(name).getPayload().getData().toStringUtf8();
    }
}

private String getProjectId() {
    if (projectId == null) {
        projectId = ServiceOptions.getDefaultProjectId();
    }
    return projectId;
}

private SecretManagerServiceClient buildClient() {
    try {
        return SecretManagerServiceClient.create();
    } catch (final IOException e) {
        throw new RuntimeException(e);
    }
}
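Usage is then a one-liner; for example (assuming a secret named db-password exists in the project):

final String dbPassword = getSecret("db-password");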
[1] - https://cloud.google.com/secret-manager/docs/reference/libraries
Google has documentation for the Secret Manager client libraries that you can use in your API. This should help you do what you want:
https://cloud.google.com/secret-manager/docs/reference/libraries
Since you haven't specified a language, I have a Node.js example of how to access the latest version of your secret using your project ID and secret name. The reason I add this is that the documentation is not clear on the string you need to provide as the name.
const [version] = await this.secretClient.accessSecretVersion({
  name: `projects/${process.env.project_id}/secrets/${secretName}/versions/latest`,
});
return version.payload.data.toString();
Be sure to allow Secret Manager access in your IAM settings for the service account that your API uses within GCP.
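That grant can also be made from the command line; a sketch, where SECRET_NAME and the service account email are placeholders for your own values:

gcloud secrets add-iam-policy-binding SECRET_NAME \
  --member="serviceAccount:SERVICE_ACCOUNT_EMAIL" \
  --role="roles/secretmanager.secretAccessor"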
I kinda found a way to use secrets as environment variables.
The following doc (https://cloud.google.com/sdk/gcloud/reference/run/deploy) states:
Specify secrets to mount or provide as environment variables. Keys starting with a forward slash '/' are mount paths. All other keys correspond to environment variables. The values associated with each of these should be in the form SECRET_NAME:KEY_IN_SECRET; you may omit the key within the secret to specify a mount of all keys within the secret.
For example: '--update-secrets=/my/path=mysecret,ENV=othersecret:key.json' will create a volume with secret 'mysecret' and mount that volume at '/my/path'. Because no secret key was specified, all keys in 'mysecret' will be included. An environment variable named ENV will also be created whose value is the value of 'key.json' in 'othersecret'. At most one of these may be specified.
Here is a snippet of Java code to get all secrets of your Cloud Run project. It requires the com.google.cloud/google-cloud-secretmanager artifact.
Map<String, String> secrets = new HashMap<>();
String projectId;
String url = "http://metadata.google.internal/computeMetadata/v1/project/project-id";
HttpURLConnection conn = (HttpURLConnection)(new URL(url).openConnection());
conn.setRequestProperty("Metadata-Flavor", "Google");
try {
InputStream in = conn.getInputStream();
projectId = new String(in.readAllBytes(), StandardCharsets.UTF_8);
} finally {
conn.disconnect();
}
Set<String> names = new HashSet<>();
try (SecretManagerServiceClient client = SecretManagerServiceClient.create()) {
ProjectName projectName = ProjectName.of(projectId);
ListSecretsPagedResponse pagedResponse = client.listSecrets(projectName);
pagedResponse
.iterateAll()
.forEach(secret -> { names.add(secret.getName()); });
for (String secretName : names) {
String name = secretName.substring(secretName.lastIndexOf("/") + 1);
SecretVersionName nameParam = SecretVersionName.of(projectId, name, "latest");
String secretValue = client.accessSecretVersion(nameParam).getPayload().getData().toStringUtf8();
secrets.put(secretName, secretValue);
}
}
Cloud Run support for referencing Secret Manager Secrets is now at general availability (GA).
https://cloud.google.com/run/docs/release-notes#November_09_2021

aws-cdk: I cannot add access permission to existing SQS

I have code where I need to grant send-message permissions on an existing SQS queue. I have this code in the aws-cdk, but it is not working: no access permissions get added.
const sqsQ = sqs.Queue.fromQueueArn(this, "some-id", "arn:aws:sqs:us-east-2:SOME-ACCOUNT:QUEUE-NAME");
sqsQ.grantSendMessages(new iam.ServicePrincipal("events.amazonaws.com"));
I don't think it's possible to grant permissions to an existing resource in CDK. Anytime you import a resource into your stack using something like fromQueueArn, you can think of this as a read-only reference to the resource.
In other words, you can only update resources which are managed by your CDK code.
You have basically 2 options here:
Move the original SQS queue into your CDK-managed stack. You can do this using the CloudFormation resource import feature (https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/resource-import.html).
Modify the SQS permissions outside CDK, in the place where the queue was originally defined (see the sketch below).
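For the second option, the queue policy can also be set with the AWS CLI; a rough sketch reusing the ARN from the question (attributes.json is a placeholder file name, and the policy mirrors the grant the CDK code above attempts):

aws sqs set-queue-attributes \
  --queue-url https://sqs.us-east-2.amazonaws.com/SOME-ACCOUNT/QUEUE-NAME \
  --attributes file://attributes.json

with attributes.json containing the JSON-encoded policy document:

{
  "Policy": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"events.amazonaws.com\"},\"Action\":\"sqs:SendMessage\",\"Resource\":\"arn:aws:sqs:us-east-2:SOME-ACCOUNT:QUEUE-NAME\"}]}"
}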
Try something like this instead
sqsQ.addToResourcePolicy(
  new PolicyStatement({
    effect: Effect.ALLOW,
    principals: [new ServicePrincipal(ServicePrincipals.EVENTS)],
    actions: ["sqs:SendMessage"],
    resources: [sqsQ.queueArn],
    conditions: {
      ArnEquals: {
        "aws:SourceArn": <ruleArn or whatever needs permissions here>,
      },
    },
  })
);

Failing to create services on Google Cloud Run with API using Java SDK

I created a Cloud Run client; however, I couldn't find a way to list services deployed with Cloud Run on GKE (for Anthos).
Create the client:
try {
    HttpTransport httpTransport = new NetHttpTransport();
    JsonFactory jsonFactory = new JacksonFactory();
    // createScoped returns a new credential instance, so keep the result
    GoogleCredentials credential = GoogleCredentials.getApplicationDefault()
            .createScoped("https://www.googleapis.com/auth/cloud-platform");
    HttpRequestInitializer requestInitializer = new HttpCredentialsAdapter(credential);
    CloudRun.Builder builder = new CloudRun.Builder(httpTransport, jsonFactory, requestInitializer);
    return builder.setApplicationName(applicationName)
            .setRootUrl(cloudRunRootUrl)
            .build();
} catch (IOException e) {
    e.printStackTrace();
}
try to list services:
services = cloudRun.namespaces().services()
.list("namespaces/default")
.execute()
.getItems();
My "hello" service is deploy on a GKE cluster under the namespace default. The above code doesn't work because the client always see "default" as project_id and complains about permission stuff. If I put the project_id rather than "default", permission errors are gone, but no services will be found.
I tried another project that does have Google fully-managed cloud run services, the same code returns result (with .list("namespaces/")).
How to access the service on GKE?
And my next question would be, how to programmatically create Cloud Run services on GKE?
Edit - for creating a service
As I couldn't figure out how to interact with Cloud Run on GKE, I took a step back and tried the fully managed one. The following code to create a service fails, and the error message doesn't provide much useful insight. How do I make it work?
Service deployedService = null;
Service helloService = new Service(); // the Service payload to create
// Map<String, String> annotations = new HashMap<>();
// annotations.put("client.knative.dev/user-image", "gcr.io/cloudrun/hello");
ServiceSpec spec = new ServiceSpec();
List<Container> containers = new ArrayList<>();
containers.add(new Container().setImage("gcr.io/cloudrun/hello"));
spec.setTemplate(new RevisionTemplate()
        .setMetadata(new ObjectMeta().setName("hello-fully-managed-v0.1.0"))
        .setSpec(new RevisionSpec()
                .setContainerConcurrency(20)
                .setContainers(containers)
                .setTimeoutSeconds(100)));
helloService.setApiVersion("serving.knative.dev/v1")
        .setMetadata(new ObjectMeta()
                .setName("hello-fully-managed")
                .setNamespace("data-infrastructure-test-env")
                // .setAnnotations(annotations)
        )
        .setSpec(spec)
        .setKind("Service");
try {
    deployedService = cloudRun.namespaces().services()
            .create("namespaces/data-infrastructure-test-env", helloService)
            .execute();
} catch (IOException e) {
    e.printStackTrace();
    response.add(e.toString());
    return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).body(response);
}
Error message I got:
com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
{
"code" : 400,
"errors" : [ {
"domain" : "global",
"message" : "The request has errors",
"reason" : "badRequest"
} ],
"message" : "The request has errors",
"status" : "INVALID_ARGUMENT"
}
at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:150)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
And the base_url is: https://europe-west1-run.googleapis.com
Your question is quite detailed (and is about Java, which I am no expert in), and there are actually too many questions in there (ideally, please ask only one question here). However, I'll try to answer a few things you asked:
First, Cloud Run (managed, and on GKE) both implement the Knative Serving API. I've explained this at https://ahmet.im/blog/cloud-run-is-a-knative/ In fact, Cloud Run on GKE is just the open source Knative components installed to your cluster.
And my next question would be, how to programmatically create Cloud Run services on GKE?
You will have a very hard time (if possible at all) using the Cloud Run API client libraries (e.g. new CloudRun above) because these are designed for *.googleapis.com endpoints.
The Knative API part of "Cloud Run on GKE" is actually just your Kubernetes (GKE) master API endpoint (which runs on an IP address, with a TLS certificate that isn't trusted by root CAs, but you can find the CA cert in the GKE GetCluster API call to verify the cert). The TLS part is why it's so hard to use the API client libraries.
Knative APIs are just Kubernetes objects. So your best bet is one of these:
The Kubernetes Java client (https://github.com/kubernetes-client/java) actually allows dynamic objects (the Go implementation does); try to use that to create Knative CRDs.
Use kubectl apply.
Ask the Knative Serving open source repository for help (they should be providing client libraries; maybe they're already there, I'm not sure).
To program Cloud Run (managed) with the API Client Libraries, you need to explicitly override the API endpoint to the region e.g. us-central1-run.googleapis.com. (This is documented on each API call's REST API reference documentation.)
I have written a blog post in detail (with sample code in Go) on how to create/update services on Cloud Run (managed) using the Knative Serving API here: https://ahmet.im/blog/gcloud-run-deploy/
If you want to see how gcloud run deploy works, and which APIs it calls, you can pass --log-http option to observe the request/response traffic.
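For example (a sketch; the service name and image are placeholders):

gcloud run deploy myservice --image gcr.io/cloudrun/hello --log-http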
As for the error you got, it seems like the error message isn't helpful, but it might be coming from anywhere (as you're trying to imitate Knative API in GCP client libraries). I recommend reading my blog posts and sample code in depth.
UPDATES: Our engineering team is looking at the issue; it appears that there's currently a bug not adding the "details" field to the error. That's being worked on.
In your case, we see the following errors from requests:
field: "spec.template.spec"
description: "Missing template spec."
This means you are not properly filling in the spec field, as shown in my blog post and sample code.
field: "metadata.name"
description: "The revision name must be prefixed by the name of the enclosing Service or Configuration with a trailing -"
Make sure the name you are specifying adheres to the patterns specified in the API docs. Try to create that name manually, perhaps in the UI or with the gcloud CLI.
field: "api_version"
description: "Unsupported API version \'serving.knative.dev/v1\'. Expected \'serving.knative.dev/v1alpha1\'"
Do not use v1alpha1 API, use v1 directly.
We'll try to get the details added to the error message; however, it appears that you need to study the sample code linked in my blog post in more detail:
https://github.com/GoogleCloudPlatform/cloud-run-button/blob/a52c7fbaae33a3e06c112206c7227a0ef9649647/cmd/cloudshell_open/deploy.go#L26-L112
The Java SDK is automatically generated from the fact that the Cloud Run (fully managed) API is public. It does not support Cloud Run for Anthos.
(gcloud.run.deploy) The revision name must be prefixed by the name of the enclosing Service or Configuration with a trailing -
The revision name is the combination of the service name and a revision suffix, and is created automatically by GCP. In an automation pipeline, make the revision suffix shorter so that the full revision name stays within the length limit (the author states 65 characters); then the problem will be resolved.
