Azure Application Gateway fails with terminal provisioning state "Failed"

I am deploying an Azure Application Gateway (internal, v2). It succeeded a couple of times in other subscriptions (environments); however, here it fails with a strange error and without much detail about the cause.
The deployment fails after 30 minutes of applying/creating.
There is a UDR on the subnet, but it is for a different purpose and does not block or restrict the default internet route.
The deployment uses Terraform, and everything worked well in the other environments' deployments.

I tried to reproduce the same issue in my environment and got the same error, shown below.
"details": [
{
"code": "Conflict",
"message": "{\r\n \"status\": \"Failed\",\r\n \"error\": {\r\n \"code\": \"ResourceDeploymentFailure\",\r\n \"message\": \"![The resource operation completed with terminal provisioning state 'Failed](https://i.imgur.com/eipLRgp.png)'.\"\r\n }\r\n}"
}
]
This issue generally occurs when an unsupported route, typically a 0.0.0.0/0 route to a firewall advertised via BGP, is affecting the Application Gateway subnet.
Try to deploy with a default VNet and a managed subnet configuration dedicated to the Application Gateway; a minimal Terraform sketch of such a subnet is shown below.
When I tried to deploy with that configuration, the Application Gateway deployment succeeded.
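For reference, a minimal Terraform sketch of a dedicated Application Gateway subnet with an explicit default route to the internet might look like the following; the resource names and address ranges are placeholders, and the resource group and VNet are assumed to be defined elsewhere in the configuration.

# Dedicated Application Gateway subnet (hypothetical name and address range)
resource "azurerm_subnet" "appgw" {
  name                 = "appgw-subnet"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
  address_prefixes     = ["10.0.1.0/24"]
}

# Route table that sends 0.0.0.0/0 straight to the internet, so a BGP-advertised
# default route to a firewall does not apply to the Application Gateway subnet
resource "azurerm_route_table" "appgw" {
  name                = "appgw-rt"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name

  route {
    name           = "default-to-internet"
    address_prefix = "0.0.0.0/0"
    next_hop_type  = "Internet"
  }
}

resource "azurerm_subnet_route_table_association" "appgw" {
  subnet_id      = azurerm_subnet.appgw.id
  route_table_id = azurerm_route_table.appgw.id
}

The v2 SKU needs its management traffic to reach the internet directly, so keeping an unrestricted default route on the gateway subnet avoids the deployment hanging until it fails.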
If your deployment fails after 30 minutes, you can use diagnostic logs to check for error messages in any logs pertaining to the unsuccessful operation (a sketch for enabling these logs with Terraform follows below).
Once you determine the cause of the issue, the diagnosis will guide you through the necessary steps to fix it, for example resolving network issues, depending on the cause of the failure.
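If the gateway resource itself gets created (even in a failed state), diagnostic logs can also be enabled from Terraform. The sketch below assumes a recent azurerm provider, an existing Log Analytics workspace, and placeholder resource names; other log categories (firewall, performance) depend on the SKU.

resource "azurerm_monitor_diagnostic_setting" "appgw" {
  name                       = "appgw-diagnostics"
  target_resource_id         = azurerm_application_gateway.appgw.id
  log_analytics_workspace_id = azurerm_log_analytics_workspace.logs.id

  # Ship the access log to the workspace for troubleshooting
  enabled_log {
    category = "ApplicationGatewayAccessLog"
  }
}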

Found the issue and resolution.
Raised a Microsoft support case to see the logs of the Application Gateway at the platform level.
Microsoft verified the logs and identified that the Application Gateway was not able to communicate with the Key Vault to read the SSL certificate, as we use Key Vault to store the SSL certificate for TLS encryption.
It turned out that subnet-to-subnet communication was blocked, so the Application Gateway was unable to communicate with the Key Vault in another subnet.
Resolution:
Allowed subnet-to-subnet communication between the subnets where the Application Gateway and the Key Vault are present (see the sketch below).
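For illustration, the allow rule looked roughly like the Terraform sketch below; the subnet prefixes, priority, and NSG reference are placeholders rather than the real values from our environment.

# Hypothetical rule on the NSG attached to the Key Vault subnet, letting the
# Application Gateway subnet reach Key Vault over HTTPS
resource "azurerm_network_security_rule" "allow_appgw_to_kv" {
  name                        = "allow-appgw-to-kv"
  priority                    = 200
  direction                   = "Inbound"
  access                      = "Allow"
  protocol                    = "Tcp"
  source_port_range           = "*"
  destination_port_range      = "443"
  source_address_prefix       = "10.0.1.0/24"   # hypothetical Application Gateway subnet
  destination_address_prefix  = "10.0.2.0/24"   # hypothetical Key Vault subnet
  resource_group_name         = azurerm_resource_group.rg.name
  network_security_group_name = azurerm_network_security_group.kv.name
}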
Conclusion:
It would help if Microsoft enabled better logging information (error details) in the Application Gateway resource deployment and/or resource activity logs.

Related

Access Azure Key Vault from an Azure Web App whose IP changes often because of CI/CD

I have a Docker container that accesses Azure Key Vault. This works when I run it locally.
I set up an Azure Web App to host my container, and it cannot access the Key Vault:
Forbidden (HTTP 403). Failed to complete operation. Message:
Client address is not authorized and caller is not a trusted service.
Client address: 51.142.174.224 Caller:
I followed the suggestion from https://www.youtube.com/watch?v=QIXbyInGXd8:
I went to the web app in the portal and set the status to On,
created an access policy,
and then received the same error with a different IP:
Forbidden (HTTP 403). Failed to complete operation. Message:
Client address is not authorized and caller is not a trusted service.
Client address: 4.234.201.129 Caller:
My web app's IP address changes every time an update is made, so are there any suggestions for how to overcome this?
It might depend on your exact use case and what you want to achieve with your tests, but you could consider using a test double instead of the real Azure Key Vault while running your app locally or on CI.
If you are interested, feel free to check out Lowkey Vault.
I found a solution by setting up a virtual network
and then whitelisting it in the Key Vault access rules; a rough Terraform sketch follows.
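A rough Terraform sketch of that setup is below, assuming regional VNet integration for the web app plus a service-endpoint rule on the vault; all names, the address range, and the app service reference are placeholders.

data "azurerm_client_config" "current" {}

# Subnet the web app integrates with; it needs the Key Vault service endpoint
# and a delegation to Microsoft.Web/serverFarms (hypothetical names and range)
resource "azurerm_subnet" "webapp_integration" {
  name                 = "webapp-integration"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
  address_prefixes     = ["10.0.3.0/24"]
  service_endpoints    = ["Microsoft.KeyVault"]

  delegation {
    name = "webapp-delegation"
    service_delegation {
      name = "Microsoft.Web/serverFarms"
    }
  }
}

# Regional VNet integration for the (existing, hypothetical) web app
resource "azurerm_app_service_virtual_network_swift_connection" "webapp" {
  app_service_id = azurerm_app_service.webapp.id
  subnet_id      = azurerm_subnet.webapp_integration.id
}

# Key Vault that only accepts traffic from the whitelisted subnet
resource "azurerm_key_vault" "kv" {
  name                = "example-kv"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  tenant_id           = data.azurerm_client_config.current.tenant_id
  sku_name            = "standard"

  network_acls {
    default_action             = "Deny"
    bypass                     = "AzureServices"
    virtual_network_subnet_ids = [azurerm_subnet.webapp_integration.id]
  }
}

With default_action set to "Deny", only calls arriving through the whitelisted subnet reach the vault, so the web app's changing outbound IPs no longer matter; depending on the app, you may also need to route all outbound traffic through the VNet so requests to the vault actually leave via the integrated subnet.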

Failed to connect Hyperledger Explorer to Fabric project

I have a Fabric project up and running with a 7-org/5-channel setup, with each org having 2 peers. Everything is up and running. Now I am trying to connect Hyperledger Explorer to view the blockchain data; however, I am facing an issue in the configuration part.
Steps I performed:
Pulled the images and added the following containers to a single docker-compose.yaml file for startup: hyperledger/explorer-db:latest, hyperledger/explorer:latest, prom/prometheus:latest, grafana/grafana:latest
Edited the created containers with the respective configurations needed and the volume mounts:
volumes:
  - ./config.json:/opt/explorer/app/platform/fabric/config.json
  - ./connection-profile:/opt/explorer/app/platform/fabric/connection-profile/
  - ./crypto-config:/tmp/crypto
  - walletstore:/opt/wallet
Since it is a multi-org setup, I edited the config.json files and pointed them to the respective connection profiles according to the organization setup:
{
  "network-configs": {
    "org1-network": {
      "name": "Sample-1",
      "profile": "./connection-profile/org1-network.json"
    },
    ... and so on for the other orgs
Edited prometheus.yml to put in the static configurations:
static_configs:
  - targets: ['localhost:8443', 'localhost:8444']                 # and so on for every peer service
  - targets: ['orderer0-service:8443', 'orderer1-service:8444']   # and so on for every orderer service
Edited the peer services in my docker-compose.yaml file to add the below values to each peer config:
  - CORE_OPERATIONS_LISTENADDRESS=0.0.0.0:9449   # RESTful API for Hyperledger Explorer
  - CORE_METRICS_PROVIDER=prometheus             # Prometheus will pull metrics
Issue: (Now resolved - see below)
It seems that Explorer isn't able to find my Admin#org1-cert.pem at the given location, but I double-checked everything and that particular path is present and accessible. All permissions on that path are also open, to avoid any permission issues.
Path in question (full path provided, not the relative path): /home/auro/Desktop/HLF/fabricapp/crypto-config/peerOrganizations/org1/users/Admin#org1/msp/signcerts/Admin#org1-cert.pem
The config files are also set up properly. I am unable to find a way to correct this. I would be really glad if someone could tell me what is going on with this path issue, because I have tried everything I can think of and still cannot get it working.
Other details:
Using Hyperledger Explorer v1.1.0 (pulling the latest Docker image)
Using Hyperledger Fabric v1.4.6 (pulling this specific version from Docker Hub)
Update: Okay, I managed to solve this. Apparently the path to be given in the config file is not the path on the local system but the path inside the Docker container. I replaced the path with the container path where the files are placed and it worked.
New Problem 1 (now solved): I then started getting a new error from Explorer.
I had a look at the peer-0-org-1-service node logs when this happened, and this is the error it logged:
2020-07-20 04:38:15.995 UTC [core.comm] ServerHandshake -> ERRO 028 TLS handshake failed with error tls: first record does not look like a TLS handshake server=PeerServer remoteaddress=172.18.0.53:33300
Update: Okay, I managed to solve this too. There were two issues: the TLS handshake wasn't happening because the TLS option wasn't set to true in the config, and the second issue of "STREAM removed" happened because the URL in the config wasn't specified as grpc. Once the changes were made, it was resolved.
New Problem 2 (current issue):
It seems there is a channel issue. Somehow Explorer still shows "not assigned to this channel", along with a new error: "Error: 14 UNAVAILABLE: failed to connect to all addresses". The same error happened for all the peers across the 7 orgs.
On top of that, the peers are suddenly not able to talk to each other.
Error Received: Could not connect to Endpoint: peer0-org2-service:7051, InternalEndpoint: peer0-org2-service:7051, PKI-ID: , Metadata: : context deadline exceeded
I checked the peer channel connection details and everything seems to be in order. I am stuck on this for now. Let me know if anyone has any ideas.
As you can see from the edits, I got one problem solved before another came along. After banging my head against it many times, I removed the entire build, rebuilt it with the corrections given above, and it simply started working.
You seem to be using an old Explorer image. I strongly recommend using the latest one, v1.1.1. Note: there are some changes to the settings format in the connection profile (e.g. the login credentials for Explorer). Please refer to README-CONFIG for details.

Error when trying to get a token using Managed Service Identity in a multi-container Azure Web App service

We have the following scenario:
Current working setup
Web API project using a single DockerFile
A release pipeline with an 'Azure App Service deploy' task.
Proposed new setup
Web API project using multi container Docker Compose file
A release pipeline with an 'Azure Web App for Containers' task.
Upon deploying the new setup we receive the below error message:
ERROR - multi-container unit was not started successfully
Unhandled exception. System.AggregateException: One or more errors occurred.
(Parameters: Connection String: XXX, Resource: https://vault.azure.net, Authority:
https://login.windows.net/xxxxx. Exception Message:
Tried to get token using Managed Service Identity.
Access token could not be acquired. Connection refused)
The exception is thrown because the app can't connect to Azure MSI (Managed Service Identity), which it does to obtain a token before connecting to Key Vault.
I have tried the following based upon some research and solutions others have found:
Connecting with "RunAs=App" (this seems to be the default parameter-less constructor anyway)
Building up the connection string myself manually by pulling the "MSI_SECRET" environment variable from the machine. This is always blank.
Restarting MSI.
Upgrading and downgrading the AppAuthentication package
MSI appears to be configured correctly, as it works perfectly with our current working setup, so we can rule that out.
It's worth noting that this is System assigned identity not a user assigned one.
The documentation that states which services support managed identities only mentions 'Azure Container Instances', not 'Azure Managed Container Instances', and that is Linux/preview only, so it could be unsupported.
Services that support managed identities for Azure resources
We've spent a considerable amount of time getting to this point with the configuration and deployment and it would be great if we could resolve this last issue.
Any help appreciated.
Unfortunately, there is currently no multi-container support for managed identities. The multi-container feature is in preview and so does not have all of its functionality working yet.
However, the documentation you linked to is also not as clear about the supported scenarios, so I am working on getting this documentation updated to better clarify this. I can update this answer once that's done.

Getting Neo4J running on OpenShift

I am trying to get the Bitnami Neo4j image running on OpenShift (testing on my local Minishift), but I am unable to connect. I am following the steps outlined in this issue (now closed), however, now I cannot access the external IP for the load balancer.
Here are the steps I have taken:
Deploy the image (bitnami/neo4j)
Create a service for the load balancer, using the YAML supplied in the issue mentioned
Get the external IP address for the LB (oc get services)
The command in step 3 lists 2 of the same IP addresses, and when I attempt to go to this IP in my browser, it times out.
I can create a route that points to port 7374 on the IP of the LB, but then I get the same error as reported in the aforementioned issue:
(ServiceUnavailable: WebSocket connection failure. Due to security constraints in your web browser, the reason for the failure is not available to this Neo4j Driver. Please use your browsers development console to determine the root cause of the failure. Common)
Configure neo4j to accept non-local connections. E.g.:
dbms.connector.bolt.address=0.0.0.0:7687
Source: https://neo4j.com/developer/kb/explanation-of-error-websocket-connection-failure/

Jenkins service won't start unless it has access to 178.255.83.1

We recently went through some network policy updates and I've discovered that my Jenkins server's jenkins service will no longer restart as expected (this worked fine prior to the policy updates).
There doesn't seem to be any logging information written at service startup (no log files get updated).
Is there a list of external IPs that Jenkins needs to access in order to start up properly?
Looking at the logs, it seems that part of the service start-up process is to contact one of the OCSP servers. This appears to be related to certificate verification, so it's probably legitimate traffic.
Once an exception was added for the target address (http://178.255.83.1:80), the Jenkins service started up without issues.
