I have to disable the Composer and Cloud Data Fusion APIs and enable them again so the default service accounts get recreated. There is a problem with the Data Fusion service account that I'm unable to recover from.
When removing the environments, I'm unable to delete the Composer environment. I don't get past this error message:
Error Message:
DELETE operation on this environment failed 2 minutes ago with the following error message:
RPC Skipped due to required preoperation not finished yet.
What I have tried is described in: Cloud Composer is not getting deleted
Any help on this please?
What finally worked for me was to disable the Composer API and enable it again, then retry deleting the Composer environment. Thanks to Unable to delete gcloud composer environment
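For reference, the API toggle and the retry can also be done from the command line; a rough sketch, with the project ID, environment name and location as placeholders:
# disable and re-enable the Composer API
gcloud services disable composer.googleapis.com --project <PROJECT_ID>
gcloud services enable composer.googleapis.com --project <PROJECT_ID>
# then retry the delete
gcloud composer environments delete <ENVIRONMENT_NAME> --location <LOCATION> --project <PROJECT_ID>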
I am trying to deploy my first stream app via the Spring Cloud Data Flow dashboard, but I keep getting the "Failed to create stream" error in the UI. Can someone help me investigate what might be wrong?
I am running SCDF on Kubernetes and my deployment consists of the following components:
scdf-server
skipper
mariadb
rabbitmq
My stream is the simple time | log example.
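For reference, the stream was defined roughly like this from the SCDF shell (the stream name ticktock is just an example):
dataflow:> stream create --name ticktock --definition "time | log" --deploy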
Try using kubectl on the scdf-server pod to see if it provides any information. I've seen that error occur if an app I deployed was not accessible; in my case, I'd referenced it by an incorrect file path, which didn't get caught by the server until it tried to deploy the stream.
It could be failing at any point in the deploy. To gain some insight, you can view the events and logs on each pod w/ the following commands:
kubectl describe pods/<pod-name>
kubectl logs pods/<pod-name>
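If the pod names aren't known yet, listing the pods first helps, e.g.:
kubectl get pods
and then run describe/logs against the scdf-server, skipper and deployed app pods.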
Our Composer instance dropped all its active workers in the middle of the day. Node memory and CPU utilization disappeared for 2 out of 3 nodes.
First errors were:
_mysql_exceptions.OperationalError: (2006, "Can't connect to MySQL server on 'airflow-sqlproxy-service.default.svc.cluster.local' (110))"
Restarting the Composer instance (with a dummy env variable) does not help; it gives the error below:
Killing the GKE workers that are in an error state does not help either. Stackdriver has this:
ERROR: (gcloud.container.clusters.describe) You do not currently have an active account selected.)
And another error seems to point to internal Google authentication service problem:
ERROR: (gcloud.container.clusters.get-credentials) There was a problem refreshing your current auth tokens: Unable to find the server at metadata.google.internal)
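(For diagnosis, the metadata server can be probed from a node or pod with something like the following; this is only a connectivity check, not a fix:)
curl -H "Metadata-Flavor: Google" "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"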
The Composer storage bucket seems to have 'Storage Legacy Bucket ...' permissions for some service accounts. Are there changes going on in the authentication backend, or what else could be the underlying cause of this sudden and strange freeze?
Versions are composer-1.8.2 and airflow-1.10.3.
We have the following scenario:
Current working setup
Web API project using a single Dockerfile
A release pipeline with an 'Azure App Service deploy' task.
Proposed new setup
Web API project using a multi-container Docker Compose file
A release pipeline with an 'Azure Web App for Containers' task.
Upon deploying the new setup we receive the below error message:
ERROR - multi-container unit was not started successfully
Unhandled exception. System.AggregateException: One or more errors occurred.
(Parameters: Connection String: XXX, Resource: https://vault.azure.net, Authority:
https://login.windows.net/xxxxx. Exception Message:
Tried to get token using Managed Service Identity.
Access token could not be acquired. Connection refused)
The exception is thrown because the app can't connect to Azure MSI (Managed Service Identity), which it does to obtain a token before connecting to Key Vault.
I have tried the following based upon some research and solutions others have found:
Connecting with "RunAs=App" (this seems to be the default parameter-less constructor anyway)
Building up the connection string manually by pulling the "MSI_SECRET" environment variable from the machine. This is always blank (see the check sketched after this list).
Restarting MSI.
Upgrading and downgrading AppAuthentication package
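One quick check, sketched here under the assumption of a Linux App Service container exposing the older 2017-09-01 MSI endpoint via MSI_ENDPOINT/MSI_SECRET (adjust if your app uses the newer IDENTITY_ENDPOINT/IDENTITY_HEADER variables), is to call the MSI endpoint directly from the container's SSH/Kudu console:
curl -s -H "Secret: $MSI_SECRET" "${MSI_ENDPOINT}?resource=https://vault.azure.net&api-version=2017-09-01"
If MSI_ENDPOINT/MSI_SECRET are not set inside the container, the MSI endpoint simply isn't being injected, which matches the blank value we see.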
MSI appears to be configured correctly, as it works perfectly with our current working setup, so we can rule that out.
It's worth noting that this is a system-assigned identity, not a user-assigned one.
The documentation that states which services support managed identities only mentions 'Azure Container Instances', not 'Azure Managed Container Instances', and that is Linux/preview only, so it may not be supported:
Services that support managed identities for Azure resources
We've spent a considerable amount of time getting to this point with the configuration and deployment and it would be great if we could resolve this last issue.
Any help appreciated.
Unfortunately, there is currently no multi-container support for managed identities. The multi-container feature is in preview and so does not have all of its functionality working yet.
However, the documentation you linked to is also not clear about the supported scenarios, so I am working on getting it updated to clarify this. I can update this answer once that's done.
When trying to deploy my container (or the hello world container) to google cloud run I receive this error:
ERROR: (gcloud.run.deploy) Cloud Run error: Internal system error. Missing necessary permission for service-<ID>@serverless-robot-prod.iam.gserviceaccount.com on resource <PROJECT ID>
I can see that the service account mentioned in the error is in my IAM dashboard and has the Google Cloud Run Service Agent role. I even tried giving it the Owner role, but it didn't work.
I tried including the --service-account flag with the same service account and receive this error:
PERMISSION_DENIED: Permission 'iam.serviceaccounts.actAs' denied on service account service-<ID>@serverless-robot-prod.iam.gserviceaccount.com (or it may not exist).
Which I know doesn't make sense.
I also tried this deploy through the console ui, but received the same error (the first one).
How do I fix this permission error?
In order to assign the iam.serviceAccounts.actAs permission, you have to grant the roles/iam.serviceAccountUser role.
You can do this by going to the Console > IAM & Admin and granting the Service Account User role to your service account.
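The equivalent with gcloud looks roughly like this (the service account e-mail and member are placeholders):
gcloud iam service-accounts add-iam-policy-binding <SERVICE_ACCOUNT_EMAIL> --member="user:<YOUR_EMAIL>" --role="roles/iam.serviceAccountUser"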
Also confirm that the Cloud Run runtime service account has the iam.serviceAccounts.actAs permission. This requirement is specified in the Cloud Run deployment permissions docs.
As Dustin mentioned, there was an outage affecting IAM permissions. Now that the outage has been resolved, my deployment is working!
Ejabberd Clustering:
I have set up two Ejabberd servers in two different Digital Ocean Droplets.
I am trying to set up clustering between these two servers.
I followed the official ejabberd documentation: 'https://docs.ejabberd.im/admin/guide/clustering/'
Copied the /home/ejabberd/.erlang.cookie file from ejabberd01 to ejabberd02.
Made sure my new ejabberd node is properly configured; the ejabberd.yml config file on the new node has the same configuration as on the other cluster nodes.
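(For illustration, the cookie copy could look like the line below; the host name and destination path are assumptions based on the step above:)
$ scp /home/ejabberd/.erlang.cookie ejabberd@ejabberd02:/home/ejabberd/.erlang.cookie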
Then, when I tried to start clustering with the command below:
$ ejabberdctl --no-timeout join_cluster 'ejabberd@ejabberd01'
I get the error below:
args: []
format: "Error when reading /opt/ejabberd/.erlang.cookie: eacces"
label: {error_logger,error_msg}
Please help me solve this issue.
Thank you in advance
That eacces thing in the error message is actually the EACCES error return code standardized by POSIX:
[EACCES]
Permission denied.
An attempt was made to access a file in a way forbidden by its file access permissions.
In other words, the credentials of the Erlang BEAM process running your ejabberd node are insufficient to open the Erlang cookie file /opt/ejabberd/.erlang.cookie.
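A minimal sketch of the usual fix, assuming ejabberd runs as the ejabberd user and group (adjust user, group and path to your installation):
$ chown ejabberd:ejabberd /opt/ejabberd/.erlang.cookie
$ chmod 400 /opt/ejabberd/.erlang.cookie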
You can start here to get more background on what Erlang cookies are.