Airflow 2.0 GoogleDriveHook upload_file: insufficient authentication scopes - oauth-2.0

I'm trying to use the Airflow 2.0 GoogleDriveHook to upload a local file to my Google Workspace's Drive. I'm not familiar with OAuth 2.0.
I took the following steps:
In GCP I:
Created airflow-drive project on GCP
Enabled the Google Drive API for this project
Created the airflow-drive-service-account service account and gave it the (project) "Owner" and "Service Account Token Creator" roles.
Created a json-key for airflow-drive-service-account
In Airflow I:
Created the Airflow Connection airflow-drive and specified the path of the json-key
Wrote the following code to test the upload_file method:
from airflow.providers.google.suite.hooks.drive import GoogleDriveHook

hook = GoogleDriveHook(
    api_version='v3',
    gcp_conn_id='airflow-drive',
    delegate_to=None,
    impersonation_chain=None,
)
hook.upload_file(
    local_location='some_file.txt',
    remote_location='/some_file.txt',
)
This is the error I receive upon running the script:
[2021-01-26 10:01:08,286] {http.py:126} WARNING - Encountered 403 Forbidden with reason "insufficientPermissions"
Traceback (most recent call last):
File "test_drive_hook.py", line 11, in <module>
hook.upload_file(
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/google/suite/hooks/drive.py", line 144, in upload_file
service.files() # pylint: disable=no-member
File "/home/airflow/.local/lib/python3.8/site-packages/googleapiclient/_helpers.py", line 134, in positional_wrapper
return wrapped(*args, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/googleapiclient/http.py", line 915, in execute
raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 403 when requesting https://www.googleapis.com/upload/drive/v3/files?fields=id&alt=json&uploadType=multipart returned "Insufficient Permission: Request had insufficient authentication scopes.". Details: "Insufficient Permission: Request had insufficient authentication scopes.">
Questions:
I don't understand where I have to configure the authentication scopes. This SO question talks about defining scopes using a SCOPES variable, but I'm not sure how to do this for Airflow / the Google Drive API.
It's unclear whether I have to use the delegate_to and/or impersonation_chain parameters for my use case. This Airflow issue touches on the subject but doesn't clarify whether I need domain-wide delegation for my use case.

Okay, the issue has been solved.
Here is the solution:
My approach above was correct. I just needed to add the https://www.googleapis.com/auth/drive scope in the Scopes field of the Airflow Connection.
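For reference, if you manage connections through environment variables instead of the UI, the same scope goes into the connection URI. A minimal sketch, assuming a google_cloud_platform connection named airflow_drive (underscored so it works as an environment variable name) and a placeholder key path; the extra__google_cloud_platform__* fields are the same ones used in the answers below:

AIRFLOW_CONN_AIRFLOW_DRIVE=google-cloud-platform://?extra__google_cloud_platform__key_path=%2Fpath%2Fto%2Fkey.json&extra__google_cloud_platform__scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive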

Related

How to set the scope using Google Operators in Airflow

I have a task using the GCSToGoogleSheetsOperator in Airflow where I'm trying to add data to a sheet.
I added the service account's email to the sheet I want to edit, with editor privileges, and received this error:
googleapiclient.errors.HttpError:
<HttpError 403 when requesting
https://sheets.googleapis.com/v4/spreadsheets/<SHEET_ID>/values/Sheet1?valueInputOption=RAW&includeValuesInResponse=false&responseValueRenderOption=FORMATTED_VALUE&responseDateTimeRenderOption=SERIAL_NUMBER&alt=json
returned "Request had insufficient authentication scopes.".
Details: "[{
'#type': 'type.googleapis.com/google.rpc.ErrorInfo',
'reason': 'ACCESS_TOKEN_SCOPE_INSUFFICIENT',
'domain': 'googleapis.com',
'metadata': {
'service': 'sheets.googleapis.com',
'method': 'google.apps.sheets.v4.SpreadsheetsService.UpdateValues'}
}]>
I can't update the sheet, but the GCS and BigQuery operators work fine.
My connection configuration looks like the following:
AIRFLOW_CONN_GOOGLE_CLOUD=google-cloud-platform://?extra__google_cloud_platform__key_path=%2Fopt%2Fairflow%2Fcredentials%2Fgoogle_credential.json
I tried following the instructions to add the scope https://www.googleapis.com/auth/spreadsheets.
URL-encoded, that looks like:
AIRFLOW_CONN_GOOGLE_CLOUD=google-cloud-platform://?extra__google_cloud_platform__key_path=%2Fopt%2Fairflow%2Fcredentials%2Fgoogle_credential.json&extra__google_cloud_platform__scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fspreadsheets
Now, operators which previously worked error out like this:
google.api_core.exceptions.Forbidden: 403 POST https://bigquery.googleapis.com/bigquery/v2/projects/my-project/jobs?prettyPrint=false: Request had insufficient authentication scopes.
And the GCSToGoogleSheetsOperator still errors out like this:
google.api_core.exceptions.Forbidden: 403 GET https://storage.googleapis.com/download/storage/v1/b/my-bucket/o/folder%2Fobject.csv?alt=media: Insufficient Permission: ('Request failed with status code', 403, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)
How can I set the permissions correctly to use the BigQuery, GCS, and Sheets operators together?
Adding a scope seems to make it ignore the IAM roles, so it's either one or the other.
The service account had the roles needed to access GCS and BigQuery, but by adding the scope https://www.googleapis.com/auth/spreadsheets, the service ignores the privileges granted by the roles and looks only at the ones specified by the scopes.
So, to recover access, you must add both the spreadsheets and cloud-platform scopes (or stricter ones): cloud-platform provides access to GCS and BigQuery, and spreadsheets to the Google Sheets API.
If you set your connection using environment variables, you have to URL-encode the arguments. To create a GOOGLE_CLOUD connection, you need something like this (shown here unencoded for readability):
AIRFLOW_CONN_GOOGLE_CLOUD=google-cloud-platform://?extra__google_cloud_platform__key_path=/abs/path_to_file/credential.json&extra__google_cloud_platform__scope=https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/spreadsheets
URL-encoded, which is the version you actually have to use (with /, , and : percent-encoded), it becomes:
AIRFLOW_CONN_GOOGLE_CLOUD=google-cloud-platform://?extra__google_cloud_platform__key_path=%2Fabs%2Fpath_to_file%2Fcredentials%2Fgoshare-driver-c08e0904285b.json&extra__google_cloud_platform__scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fspreadsheets
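If you'd rather not encode the value by hand, here is a small Python sketch that builds the encoded URI with the standard library (the key path is a placeholder):

from urllib.parse import quote

key_path = "/abs/path_to_file/credential.json"  # placeholder
scopes = ",".join([
    "https://www.googleapis.com/auth/cloud-platform",
    "https://www.googleapis.com/auth/spreadsheets",
])

# Percent-encode the path and the comma-separated scope list for the connection URI.
conn_uri = (
    "google-cloud-platform://?"
    "extra__google_cloud_platform__key_path=" + quote(key_path, safe="") +
    "&extra__google_cloud_platform__scope=" + quote(scopes, safe="")
)
print("AIRFLOW_CONN_GOOGLE_CLOUD=" + conn_uri)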

Google::Apis::ClientError: forbidden: The caller does not have permission when using service account call with ruby google api

I'm trying to use a service account with Google's API to work with Classroom data, with the goal of synchronizing our web service for schools with the Google Classroom data.
I have delegated domain-wide authority to the service account and activated the Google Classroom API. I have downloaded the JSON key file used below.
I have added https://www.googleapis.com/auth/classroom.courses to the scope of the service account.
My test code in app/models/g_service.rb:
class GService
  require 'google/apis/classroom_v1'

  def get_course
    authorizer = Google::Auth::ServiceAccountCredentials.make_creds(
      json_key_io: File.open('/Users/jose/Downloads/skt1-301603-4a655caa8963.json'),
      scope: [Google::Apis::ClassroomV1::AUTH_CLASSROOM_COURSES]
    )
    authorizer.fetch_access_token!

    service = Google::Apis::ClassroomV1::ClassroomService.new
    service.authorization = authorizer
    puts "\n service\n #{service.inspect}"

    response = service.get_course('99999')
    puts "\n response \n#{response.inspect}"
  end
end
The results in the console are:
>> GService.new.get_course
service
#<Google::Apis::ClassroomV1::ClassroomService:0x007fe1cff98338 #root_url="https://classroom.googleapis.com/", #base_path="", #upload_path="upload/", #batch_path="batch", #client_options=#<struct Google::Apis::ClientOptions application_name="unknown", application_version="0.0.0", proxy_url=nil, open_timeout_sec=nil, read_timeout_sec=nil, send_timeout_sec=nil, log_http_requests=false, transparent_gzip_decompression=true>, #request_options=#<struct Google::Apis::RequestOptions authorization=#<Google::Auth::ServiceAccountCredentials:0x0xxxxxxx #project_id="sssssssss", #authorization_uri=nil, #token_credential_uri=#<Addressable::URI:0x000000000 URI:https://www.googleapis.com/oauth2/v4/token>, #client_id=nil, #client_secret=nil, #code=nil, #expires_at=2021-01-13 20:56:46 -0800, #issued_at=2021-01-13 19:56:47 -0800, #issuer="xxxxxxx.iam.gserviceaccount.com", #password=nil, #principal=nil, #redirect_uri=nil, #scope=["https://www.googleapis.com/auth/classroom.courses"], #state=nil, #username=nil, #access_type=:offline, #expiry=60, #audience="https://www.googleapis.com/oauth2/v4/token", #signing_key=#<OpenSSL::PKey::RSA:0xxxxxxxxx>, #extension_parameters={}, #additional_parameters={}, #connection_info=nil, #grant_type=nil, #refresh_token=nil, #access_token="-------------------------------------------------------------------------------------->, retries=0, header=nil, normalize_unicode=false, skip_serialization=false, skip_deserialization=false, api_format_version=nil, use_opencensus=true>>
Google::Apis::ClientError: forbidden: The caller does not have permission
It appears everything is working fine until the service.get_course('99999') call.
I've tested this call using the https://developers.google.com/classroom/reference/rest/v1/courses/get online tool and it works fine.
I've pored over the documentation but have been unable to resolve this.
Can anybody please let me know what I am missing?
I'm running rails 3.2 and ruby 2.1
Considering the error you're getting, I think you are not impersonating any account.
The purpose of domain-wide delegation is that the service account can impersonate a regular account in your domain, but in order to do that, you have to specify which account you want to impersonate. Otherwise, you are calling the service account by itself, and it doesn't matter that you've enabled domain-wide delegation for it.
In the Ruby library, you can specify that using the :sub parameter, as shown in the section Preparing to make an authorized API call in the library docs:
authorizer.sub = "<email-address-to-impersonate>"
Note:
Make sure the account you impersonate has access to this course, otherwise you'll get the same error.
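For readers coming from the Python side of this thread, the equivalent delegation with google-auth looks like the sketch below (this is not the Ruby library used in the question; the key path and e-mail are placeholders):

from google.oauth2 import service_account

SCOPES = ["https://www.googleapis.com/auth/classroom.courses"]

credentials = service_account.Credentials.from_service_account_file(
    "/path/to/key.json", scopes=SCOPES)  # placeholder key path
# with_subject() impersonates a regular account in the domain (domain-wide
# delegation), analogous to authorizer.sub in the Ruby library.
delegated_credentials = credentials.with_subject("teacher@your-domain.com")  # placeholder e-mail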
Related:
Delegating domain-wide authority to the service account
Google API Server-to-Server Communication not working (Ruby implementation)

Failing to create services on Google Cloud Run with API using Java SDK

I created a Cloud Run client; however, I couldn't find a way to list a service that is deployed with Cloud Run on GKE (for Anthos).
Create the client:
try {
    HttpTransport httpTransport = new NetHttpTransport();
    JsonFactory jsonFactory = new JacksonFactory();
    // createScoped returns a new credentials object, so its result must be assigned
    GoogleCredentials credential = GoogleCredentials.getApplicationDefault()
            .createScoped("https://www.googleapis.com/auth/cloud-platform");
    HttpRequestInitializer requestInitializer = new HttpCredentialsAdapter(credential);
    CloudRun.Builder builder = new CloudRun.Builder(httpTransport, jsonFactory, requestInitializer);
    return builder.setApplicationName(applicationName)
            .setRootUrl(cloudRunRootUrl)
            .build();
} catch (IOException e) {
    e.printStackTrace();
}
try to list services:
services = cloudRun.namespaces().services()
        .list("namespaces/default")
        .execute()
        .getItems();
My "hello" service is deploy on a GKE cluster under the namespace default. The above code doesn't work because the client always see "default" as project_id and complains about permission stuff. If I put the project_id rather than "default", permission errors are gone, but no services will be found.
I tried another project that does have Google fully-managed cloud run services, the same code returns result (with .list("namespaces/")).
How to access the service on GKE?
And my next question would be, how to programmatically create Cloud Run services on GKE?
Edit - for creating a service
As I couldn't figure out how to interact with Cloud Run on GKE, I took a step back and tried the fully managed one. The following code to create a service fails, and the error message doesn't provide much useful insight. How can I make it work?
Service deployedService = null;
Service helloService = new Service();  // the Knative Service payload to deploy
// Map<String,String> annotations = new HashMap<>();
// annotations.put("client.knative.dev/user-image","gcr.io/cloudrun/hello");
ServiceSpec spec = new ServiceSpec();
List<Container> containers = new ArrayList<>();
containers.add(new Container().setImage("gcr.io/cloudrun/hello"));
spec.setTemplate(new RevisionTemplate().setMetadata(new ObjectMeta().setName("hello-fully-managed-v0.1.0"))
        .setSpec(new RevisionSpec().setContainerConcurrency(20)
                .setContainers(containers)
                .setTimeoutSeconds(100)
        )
);
helloService.setApiVersion("serving.knative.dev/v1")
        .setMetadata(new ObjectMeta().setName("hello-fully-managed")
                .setNamespace("data-infrastructure-test-env")
                // .setAnnotations(annotations)
        )
        .setSpec(spec)
        .setKind("Service");
try {
    deployedService = cloudRun.namespaces().services()
            .create("namespaces/data-infrastructure-test-env", helloService)
            .execute();
} catch (IOException e) {
    e.printStackTrace();
    response.add(e.toString());
    return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).body(response);
}
Error message I got:
com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
{
"code" : 400,
"errors" : [ {
"domain" : "global",
"message" : "The request has errors",
"reason" : "badRequest"
} ],
"message" : "The request has errors",
"status" : "INVALID_ARGUMENT"
}
at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:150)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
And the base_url is: https://europe-west1-run.googleapis.com
Your question is quite detailed (and is about Java which I am no expert in) and there are actually too many questions in there (ideally, please ask only 1 question here). However, I'll try to answer a few things you asked:
First, Cloud Run (managed, and on GKE) both implement the Knative Serving API. I've explained this at https://ahmet.im/blog/cloud-run-is-a-knative/ In fact, Cloud Run on GKE is just the open source Knative components installed to your cluster.
And my next question would be, how to programmatically create Cloud Run services on GKE?
You will have a very hard time (if possible at all) using the Cloud Run API client libraries (e.g. new CloudRun above) because these are designed for *.googleapis.com endpoints.
The Knative API part of "Cloud Run on GKE" is actually just your Kubernetes (GKE) master API endpoint (which runs on an IP address, with a TLS certificate that isn't trusted by root CAs, though you can find the CA cert in the GKE GetCluster API call to verify the cert). The TLS part is why it's so hard to use the API client libraries.
Knative APIs are just Kubernetes objects. So your best bet is one of these:
See whether the Kubernetes Java client (https://github.com/kubernetes-client/java) allows dynamic objects (the Go implementation does) and try to use that to create the Knative CRDs.
Use kubectl apply (see the manifest sketch after this list).
Ask the Knative Serving open source repository for help (they should be providing client libraries; maybe they're already there, I'm not sure).
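For the kubectl route, here is a minimal sketch of such a manifest, reusing the gcr.io/cloudrun/hello image from the question (the name, namespace, and API version are assumptions and may need adjusting for your cluster):

# hello-service.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
  namespace: default
spec:
  template:
    spec:
      containers:
        - image: gcr.io/cloudrun/hello

Apply it with: kubectl apply -f hello-service.yaml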
To program Cloud Run (managed) with the API client libraries, you need to explicitly override the API endpoint to the regional one, e.g. us-central1-run.googleapis.com. (This is documented in each API call's REST reference documentation.)
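As an illustration of that endpoint override, here is a sketch with the Python API client rather than the Java SDK from the question (the project ID is a placeholder):

from googleapiclient.discovery import build

# Point the client at the regional endpoint where the services are deployed.
run = build(
    "run", "v1",
    client_options={"api_endpoint": "https://europe-west1-run.googleapis.com"},
)
services = (
    run.namespaces().services()
    .list(parent="namespaces/my-project-id")  # placeholder project ID
    .execute()
    .get("items", [])
)
for svc in services:
    print(svc["metadata"]["name"])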
I have written a blog post in detail (with sample code in Go) on how to create/update services on Cloud Run (managed) using the Knative Serving API here: https://ahmet.im/blog/gcloud-run-deploy/
If you want to see how gcloud run deploy works, and which APIs it calls, you can pass --log-http option to observe the request/response traffic.
As for the error you got, it seems like the error message isn't helpful, but it might be coming from anywhere (as you're trying to imitate Knative API in GCP client libraries). I recommend reading my blog posts and sample code in depth.
UPDATE: Our engineering team is looking at the issue; it appears there's currently a bug where the "details" field is not added to the error. That's being worked on.
In your case, we see the following errors from requests:
field: "spec.template.spec"
description: "Missing template spec."
This means you are not properly filling in the spec field, as shown in my blog post and sample code.
field: "metadata.name"
description: "The revision name must be prefixed by the name of the enclosing Service or Configuration with a trailing -"
Make sure the name you are specifying adheres to the pattern specified in the API docs. Try to create that name manually, perhaps in the UI or with the gcloud CLI.
field: "api_version"
description: "Unsupported API version \'serving.knative.dev/v1\'. Expected \'serving.knative.dev/v1alpha1\'"
Do not use v1alpha1 API, use v1 directly.
We'll try to get the details added to the error message; however, it appears that you need to study the sample code linked in my blog post in more detail:
https://github.com/GoogleCloudPlatform/cloud-run-button/blob/a52c7fbaae33a3e06c112206c7227a0ef9649647/cmd/cloudshell_open/deploy.go#L26-L112
The Java SDK is automatically generated because the Cloud Run (fully managed) API is public; it does not support Cloud Run for Anthos.
(gcloud.run.deploy) The revision name must be prefixed by the name of the enclosing Service or Configuration with a trailing -
The revision name is the combination of the service name and the revision suffix and is created automatically by GCP; it should stay within 65 characters. In an automation pipeline with GCP, keep the revision suffix short so the full revision name fits, and the problem will be resolved.

Authenticate to Gsuite APIs without service account

I'm trying to read a Google Spreadsheet using my local credentials (i.e. not a service account). I'm basically doing:
from googleapiclient.discovery import build

service = build('sheets', 'v4')
request = service.spreadsheets().values().get(
    spreadsheetId=SPREADSHEET_ID, range='{}!A1:a4'.format(SHEET_NAME))
result = request.execute()
values = result.get('values', [])
However I get the following error:
Traceback (most recent call last):
File "./update_incidents_tracker.py", line 54, in <module>
sys.exit(main())
File "./update_incidents_tracker.py", line 48, in main
result = request.execute()
File "/home/filip/.virtualenvs/monitoring-tools/lib/python3.6/site-packages/googleapiclient/_helpers.py", line 134, in positional_wrapper
return wrapped(*args, **kwargs)
File "/home/filip/.virtualenvs/monitoring-tools/lib/python3.6/site-packages/googleapiclient/http.py", line 898, in execute
raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 403 when requesting https://sheets.googleapis.com/v4/spreadsheets/1lTcQ3WknG_2oZvy9O9LEbgAzoykOWaheYSaDkkV21wE/values/Incidents%21A1%3Aa4?alt=json returned "Request had insufficient authentication scopes.">
I can't find a way to set scopes for my local (gcloud SDK) credentials. How can I set the proper scopes?
For reference, with service-account-based authentication I can simply run:
credentials = service_account.Credentials.from_service_account_file(CREDENTIALS_FILE, scopes=SCOPES)
build('sheets', 'v4', credentials=credentials)
Use the default credentials explicitly, and set the scopes at that time:
import google.auth

# Pass the required scopes explicitly when loading the default credentials,
# e.g. the Sheets scope from the error above.
credentials, project_id = google.auth.default(
    scopes=['https://www.googleapis.com/auth/spreadsheets'])
build('sheets', 'v4', credentials=credentials)
....
Full documentation of google-auth here

ejabberd - Configuration of mod_http_api

I'm in the midst of testing mod_http_api to replace the existing usage of mod_rest in our implementation.
I can unrestrict access to some commands for a group of IP addresses by using the "admin_ip_access" option, and I can successfully execute some commands (e.g. change_password).
However, in some cases we also require login, both as a user (own account) and as an admin (own and other users' accounts).
However, when I tried to log in with Basic Auth, it was not successful; I keep getting the following. If my assumption is correct, this might be related to the configuration.
Will be much appreciated if someone could show me how the correct configuration should be done.
{
"status": "error",
"code": 31,
"message": "Command need to be run with admin priviledge."
}
Current config
modules:
  mod_http_api:
    admin_ip_access: admin_ip_access_rule
acl:
  admin_ip_acl:
    ip:
      - "xx.xx.xx.xx/32"
access:
  admin_ip_access_rule:
    admin_ip_acl:
      - all
EDIT
For testing purposes, I've enabled the following configuration:
commands_admin_access: configure
commands:
  - add_commands:
      - status
      - get_roster
      - change_password
      - register
      - unregister
      - registered_users
      - muc_online_rooms
      - oauth_issue_token
I am able to run both user and admin commands successfully for the commands listed inside the add_commands tag; it works as expected. However, I still face some issues, mostly related to the IP restriction: calling the API from a host that is not listed in admin_ip_acl also succeeds, whereas I expect an error when calling from a non-whitelisted host.
The API requires an OAuth token for authentication; you need to generate one with the correct scope. When a command is restricted to an admin, you also need to pass the HTTP header "X-Admin: true" to let ejabberd know that you want to act as an admin.
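A rough Python sketch of such a call with the requests library (host, port, command, vhost, and token are placeholders; adjust them to your listener configuration):

import requests

EJABBERD_API = "https://xmpp.example.com:5443/api"   # placeholder host/port
TOKEN = "<oauth-token-issued-via-oauth_issue_token>"  # placeholder token

response = requests.post(
    EJABBERD_API + "/registered_users",
    json={"host": "example.com"},          # placeholder vhost
    headers={
        "Authorization": "Bearer " + TOKEN,
        "X-Admin": "true",                 # act as admin for admin-restricted commands
    },
)
print(response.status_code, response.json())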

Resources