How does Dataflow authenticate the worker service account? - google-cloud-dataflow

I created a service account in project A to use as the worker service account for Dataflow.
I specify the worker service account in Dataflow's pipeline options.
I've looked for a Dataflow option to specify a service account key for the worker service account, but can't find one.
I ran the pipeline with the following program arguments and it worked fine, even though I launched it with a service account that is different from the worker service account that exists in project A.
--project=projectA --serviceAccount=my-service-account-name@projectA.iam.gserviceaccount.com
I didn't load the JSON credentials file for the worker service account in my Apache Beam application.
And I haven't specified a service account key for the worker service account in the Dataflow options.
How does Dataflow authenticate the worker service account?

Please take a look at Dataflow security and permissions -> Security and permissions for pipelines on Google Cloud.
By default, Dataflow uses the project's Compute Engine default service account as the worker service account. The workers are Compute Engine VMs with that service account attached, so they obtain credentials from the instance metadata server; no key file is required.
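If you use a custom worker service account as in the question, the IAM setup might look roughly like this (a sketch; you@example.com stands in for whoever launches the job):

gcloud projects add-iam-policy-binding projectA \
    --member="serviceAccount:my-service-account-name@projectA.iam.gserviceaccount.com" \
    --role="roles/dataflow.worker"

# the launching identity must be allowed to attach (act as) the worker SA
gcloud iam service-accounts add-iam-policy-binding \
    my-service-account-name@projectA.iam.gserviceaccount.com \
    --member="user:you@example.com" \
    --role="roles/iam.serviceAccountUser"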

Related

Permission error in GCP when creating a new compute instance but service account does have permissions

I am running a cloudbuild.yaml job in Google Cloud Platform that builds, pushes, and tags a Docker image, and then creates a Compute Engine instance to run that image via gcr.io/cloud-builders/gcloud create-with-container. I also specify a service account to be used in this step:
- id: "Create Compute Engine instance"
name: gcr.io/cloud-builders/gcloud
args: [
'compute',
'instances',
'create-with-container',
'${INSTANCE_NAME}',
'--container-image',
'eu.gcr.io/${PROJECT_ID}/${PROJECT_ID}-${REPO_NAME}',
'--zone',
'${ZONE}',
'--service-account',
'${SERVICE_ACCOUNT},
'--machine-type',
'n2-standard-4'
]
However I am getting an error:
Already have image (with digest): gcr.io/cloud-builders/gcloud
ERROR: (gcloud.compute.instances.create-with-container) Could not fetch resource:
- Required 'compute.instances.create' permission for 'projects/...'
The service account in use does have the permissions for that as it has been assigned "role": "roles/compute.instanceAdmin.v1", which includes compute.instances.* as per documentation.
Anyone has experienced this or a similar situation and can give a hint on how to proceed? Am I missing something obvious? I have tried using other service accounts, including the project default compute account and get the same error. One thing to note is I do not specify a service account for Docker steps (gcr.io/cloud-builders/docker).
Make sure that you are not mixing up the two service accounts involved.
There is a special service account used by Cloud Build itself.
There is also the service account to "be used" by the VM/instance you are creating.
The compute.instances.create permission should be granted to the special Cloud Build account, not to the account for the instance.
The Cloud Build account has a name like 123123123@cloudbuild.gserviceaccount.com.
In the Cloud Console, go to Cloud Build -> Settings -> Service Accounts
and check whether the correct permissions are granted.
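A minimal sketch of the grants (assuming 123123123 is your project number; substitute your own values):

gcloud projects add-iam-policy-binding ${PROJECT_ID} \
    --member="serviceAccount:123123123@cloudbuild.gserviceaccount.com" \
    --role="roles/compute.instanceAdmin.v1"

# Cloud Build must also be allowed to attach ${SERVICE_ACCOUNT} to the new VM
gcloud iam service-accounts add-iam-policy-binding ${SERVICE_ACCOUNT} \
    --member="serviceAccount:123123123@cloudbuild.gserviceaccount.com" \
    --role="roles/iam.serviceAccountUser"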

GCP Cloud Run Cannot Pull Image from Artifact Registry in Other Project

I have a parent project that has an artifact registry configured for docker.
A child project has a cloud run service that needs to pull its image from the parent.
The child project also has a service account that is authorized to access the repository via an IAM role roles/artifactregistry.writer.
When I try to start my service I get an error message:
Google Cloud Run Service Agent must have permission to read the image,
europe-west1-docker.pkg.dev/test-parent-project/docker-webank-private/node:custom-1.
Ensure that the provided container image URL is correct and that the
above account has permission to access the image. If you just enabled
the Cloud Run API, the permissions might take a few minutes to
propagate. Note that the image is from project [test-parent-project], which
is not the same as this project [test-child-project]. Permission must be
granted to the Google Cloud Run Service Agent from this project.
I have tested manually connecting with docker login using the service account's private key, and the docker pull command works perfectly from my PC.
cat $GOOGLE_APPLICATION_CREDENTIALS | docker login -u _json_key --password-stdin https://europe-west1-docker.pkg.dev
> Login succeeded
docker pull europe-west1-docker.pkg.dev/bfb-cicd-inno0/docker-webank-private/node:custom-1
> OK
The service account is also attached to the Cloud Run service.
You have 2 types of service accounts used in Cloud Run:
The Google Cloud Run API service account
The runtime service account.
In your explanation and your screenshot, you talk about the runtime service account, the identity that the service uses when it runs and calls Google Cloud APIs.
BUT before running, the service must be deployed. This time it's an internal Google Cloud Run process that runs to pull the container, create a revision, and do all the required internal work. A service account also exists for that job; it's named the "service agent".
In the IAM console you can find it; the format is the following:
service-<PROJECT_NUMBER>@serverless-robot-prod.iam.gserviceaccount.com
Don't forget to tick the checkbox in the upper right corner to include Google-managed service accounts.
If you want this deployment service account to be able to pull images from another project, grant the correct permission on it, not on the runtime service account.
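A sketch of that grant, run against the parent project (repository name and location taken from the error message; <CHILD_PROJECT_NUMBER> is a placeholder):

gcloud artifacts repositories add-iam-policy-binding docker-webank-private \
    --project=test-parent-project \
    --location=europe-west1 \
    --member="serviceAccount:service-<CHILD_PROJECT_NUMBER>@serverless-robot-prod.iam.gserviceaccount.com" \
    --role="roles/artifactregistry.reader"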

Azure RBAC and AKS not working as expected

I have created an AKS cluster with AKS-managed Azure Active Directory and role-based access control (RBAC) enabled.
If I try to connect to the cluster using one of the accounts included in the admin Azure AD groups, everything works as it should.
I am having some difficulties when I try to do this with a user which is not a member of the admin Azure AD groups. What I did is the following:
created a new user
assigned the roles Azure Kubernetes Service Cluster User Role and Azure Kubernetes Service RBAC Reader to this user.
executed the following command: az aks get-credentials --resource-group RG1 --name aksttest
When I then execute the following command: kubectl get pods -n test I get the following error: Error from server (Forbidden): pods is forbidden: User "aksthree@tenantname.onmicrosoft.com" cannot list resource "pods" in API group "" in the namespace "test"
In the cluster I haven't done any RoleBinding. According to the docs from Microsoft, there is no additional task that should be done in the cluster (like, for example, role definitions and RoleBindings).
My expectation is that when a user has the above two roles assigned he should be able to have read rights in the Cluster. Am I doing something wrong?
Please let me know what you think,
Thanks in advance,
Mike
When you use AKS-managed Azure Active Directory, it enables authentication as an AD user, but authorization happens in Kubernetes RBAC only, so you have to configure Azure IAM and Kubernetes RBAC separately. For example, it adds the aks-cluster-admin-binding-aad ClusterRoleBinding, which provides access to accounts included in the admin Azure AD groups.
The Azure Kubernetes Service RBAC Reader role is applicable to Azure RBAC for Kubernetes Authorization, a feature on top of AKS-managed Azure Active Directory where both authentication and authorization happen with AD and Azure RBAC. It uses webhook token authentication at the API server to verify tokens.
You can enable Azure RBAC for Kubernetes Authorization on an existing cluster that already has AAD integration:
az aks update -g <myResourceGroup> -n <myAKSCluster> --enable-azure-rbac
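Alternatively, staying with plain Kubernetes RBAC, a RoleBinding to the built-in view ClusterRole would grant the read access (a sketch using the user and namespace from the question):

kubectl create rolebinding aad-user-read \
    --clusterrole=view \
    --user=aksthree@tenantname.onmicrosoft.com \
    --namespace=test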

How to determine service account used to run Dataflow job?

My Dataflow job fails when it tries to access a secret:
"Exception in thread "main" com.google.api.gax.rpc.PermissionDeniedException: io.grpc.StatusRuntimeException: PERMISSION_DENIED: Permission 'secretmanager.versions.access' denied for resource 'projects/REDACTED/secrets/REDACTED/versions/latest' (or it may not exist)."
I launch the job using gcloud dataflow flex-template run. I am able to view the secret in the console. The same code works when I run it on my laptop. As I understand it, when I submit a job with the above command, it runs under a service account that may have different permissions. How do I determine which service account the job runs under?
Dataflow workers run on Compute Engine instances, so the instance-creation events show up in Logging. You can check this as follows:
Open GCP console
Open Logging -> Logs Explorer (make sure you are not using the "Legacy Logs Viewer")
At the query builder type in protoPayload.serviceName="compute.googleapis.com"
Click Run Query
Expand the entry for v1.compute_instances.create or any other resources used by compute.googleapis.com
You should be able to see the service account used for creating the instance. This service account is used for everything related to running the Dataflow job.
Take note that I tested this using the official Dataflow quickstart.
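The same lookup might be done from the CLI (a sketch; the limit and output format are arbitrary choices):

gcloud logging read 'protoPayload.serviceName="compute.googleapis.com"' \
    --limit=10 \
    --format='value(protoPayload.authenticationInfo.principalEmail)'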
By default the Dataflow worker nodes run with the Compute Engine default service account (YOUR_PROJECT_NUMBER-compute@developer.gserviceaccount.com), which lacks the "Secret Manager Secret Accessor" role.
Either add that role to the service account, or specify a different service account in the pipeline options:
gcloud dataflow flex-template run ... --parameters service_account_email="your-service-account-name@YOUR_PROJECT_ID.iam.gserviceaccount.com"
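If you keep the default account, the grant might look like this (a sketch; YOUR_SECRET_NAME is a placeholder for the redacted secret in the question):

gcloud secrets add-iam-policy-binding YOUR_SECRET_NAME \
    --member="serviceAccount:YOUR_PROJECT_NUMBER-compute@developer.gserviceaccount.com" \
    --role="roles/secretmanager.secretAccessor"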

Running a pod as a service account to connect to a database with Integrated Security

I have a .NET Core service running on Azure Kubernetes Service and a Linux Docker image. It needs to connect to an on-premise database with Integrated Security. One of the service accounts in my on-premise AD has access to this database.
My question is - is it possible to run a pod under a specific service account so the service can connect to the database? (Another approach I took was to impersonate the call with WindowsIdentity.RunImpersonated, but that requires advapi32.dll and I couldn't find a way to deploy it to the Linux container and make it run.)
A pod can run with the permissions of an Azure Active Directory service account if you install and implement the AAD Pod Identity components in your cluster.
You'll need to set up an AzureIdentity and an AzureIdentityBinding resource in your cluster, then add a label to the pod(s) that will use the permissions associated with the service account.
Please note that this approach relies on the managed identity or service principal associated with your cluster having the "Managed Identity Operator" role granted against the service account used to access SQL Server (the service account must exist in Azure Active Directory).
I suspect you may actually need the pods to take on the identity of a group managed service account (gMSA), which exists in your local AD only. I don't think this is supported in Linux containers (Windows nodes recently gained gMSA support as a GA feature).
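For reference, the AzureIdentity/AzureIdentityBinding pair might look roughly like this (a sketch; the subscription, resource group, identity name, client ID, and the sql-access selector are all placeholders):

cat <<EOF | kubectl apply -f -
apiVersion: aadpodidentity.k8s.io/v1
kind: AzureIdentity
metadata:
  name: sql-access-identity
spec:
  type: 0   # 0 = user-assigned managed identity, 1 = service principal
  resourceID: /subscriptions/<SUB_ID>/resourcegroups/<RG>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<IDENTITY_NAME>
  clientID: <CLIENT_ID>
---
apiVersion: aadpodidentity.k8s.io/v1
kind: AzureIdentityBinding
metadata:
  name: sql-access-binding
spec:
  azureIdentity: sql-access-identity
  selector: sql-access
EOF

Pods then opt in with the label aadpodidbinding: sql-access.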
