Getting 503 Error on Google Dataflow - UNAVAILABLE - google-cloud-dataflow

I'm trying to run an Apache Beam Python pipeline on Dataflow, but immediately (10-15 seconds) after launching, the job goes to a failed status.
The error in the logs:
Failed to start the VM, launcher-2021030302333314603154945777358700, used for launching because of status code: UNAVAILABLE, reason: One or more operations had an error: 'operation-1614767615027-5bc9f6216a93c-7b50752f-842a8707': [UNAVAILABLE] 'HTTP_503'..
The error message gets cut short, so I cannot dig any deeper. I believe I have added all the relevant permissions, but I can't get it to work. Initial research suggests it could be either a backend issue or a permissions issue.
In addition, the same pipeline has worked in other projects.
I'd appreciate it if someone could help me debug and fix this.

It was because of the region. I moved the region from 'asia-southeast2' to 'us-central1' and it worked.
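For reference, a minimal relaunch sketch with the region pinned explicitly; pipeline.py, my-project, and the bucket are placeholders rather than values from the original job:

# Hypothetical relaunch of the Beam Python pipeline in a different region.
python pipeline.py \
  --runner DataflowRunner \
  --project my-project \
  --temp_location gs://my-bucket/tmp \
  --region us-central1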

Related

HTTP Error 500.37 - ANCM Failed to Start Within Startup Time Limit - aspnetcore6.0

I am facing the issue below after switching from VS2019 to VS2022. I am not able to run my API project; it throws the error below. I searched on Google and read many articles but had no success. Please suggest how to resolve this problem. I checked and found that the newer .NET Core version does not support the older version's code.
HTTP Error 500.37 - ANCM Failed to Start Within Startup Time Limit
Common solutions to this issue:
ANCM failed to start after -1 milliseconds
Troubleshooting steps:
Check the system event log for error messages
Enable logging the application process' stdout messages
Attach a debugger to the application process and inspect

GCP Dataflow warning message RMI TCP "java.net.SocketTimeoutException: Accept timed out"

I am running an Apache Beam Java pipeline and for some reason I am getting lots of warning logs in GCP.
I tried changing the log level of the packages java.net and sun.rmi to SEVERE, but still no success.
The logs are getting polluted with these warning messages. Is anyone else facing the same issue?
jsonPayload: {
exception: "java.net.SocketTimeoutException: Accept timed out
at java.base/java.net.PlainSocketImpl.socketAccept(Native Method)
at java.base/java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:458)
at java.base/java.net.ServerSocket.implAccept(ServerSocket.java:551)
at java.base/java.net.ServerSocket.accept(ServerSocket.java:519)
at java.rmi/sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:394)
at java.rmi/sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:366)
at java.base/java.lang.Thread.run(Thread.java:834)
"
logger: "sun.rmi.transport.tcp"
message: "RMI TCP Accept-5555: accept loop for ServerSocket[addr=0.0.0.0/0.0.0.0,localport=5555] throws"
The pipeline is simple: Pub/Sub to Postgres. No additional third-party connectivity.
Please refer to the public documentation on troubleshooting.
Select the job to view more detailed information on errors and run results. When you select a job, you can view the execution graph as well as some information about the job. Then, click the Logs button to view log messages generated by your pipeline code and the Dataflow service.
Another thing you can use is the debug option: when running a gcloud command, you can include the option --verbosity=debug to get debugging output.
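For example, a hypothetical check of the failing job with debug output enabled (the job ID is a placeholder):

gcloud dataflow jobs describe JOB_ID --region=asia-southeast2 --verbosity=debug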
This might be related to a JVM bug. Please check your Java SDK version and upgrade to a newer version (2.17.0 or higher).
Additionally, check the 'Encoding errors, IOExceptions, or unexpected behavior in user code' section of the error guidance.
I hope you find the above pieces of information useful.
I couldn't figure out the actual issue, but in the meantime, since it was polluting the log traces, I added this flag to the pipeline options:
--workerLogLevelOverrides={"sun.rmi.transport.tcp":"OFF"}
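A hypothetical way to pass that flag when launching the Java pipeline with Maven; the main class and the other arguments are placeholders:

# Sketch: escape the JSON map so the shell passes it through intact.
mvn compile exec:java \
  -Dexec.mainClass=com.example.PubsubToPostgres \
  -Dexec.args="--runner=DataflowRunner --workerLogLevelOverrides={\"sun.rmi.transport.tcp\":\"OFF\"}"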

Jenkins doesn't update GitHub check status sometimes

I'm using a Jenkins 2.15 (GitHub plugin 1.29.3) based CI for my GitHub core repo. It works fine, but sometimes a Jenkins build doesn't update the GitHub check status.
I see nothing relevant in the Jenkins log.
Any idea how to debug and hopefully fix this issue?
As far as I know, a check status update is just an HTTP request to the status API: https://developer.github.com/v3/repos/statuses/
I experienced similar behavior with a database. The client application and the database had no errors; each was on a different host.
What I did was create a bash script on host A to ping host B:
ping www.host_B.com | while read pong; do echo "$(date): $pong"; done >> /tmp/ping-test-$(date +%F).log
Then, when the sporadic database connection error occurred, the log file helped me determine that the error was related to:
Network issues
Latency issues
Internet service provider issues
In your case, you could perform a simple curl against the status API and correlate the results with the sporadic behavior you detected.
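For instance, a probe in the same spirit as the ping log above; OWNER, REPO, SHA, and YOUR_TOKEN are placeholders:

# Hypothetical probe: append a timestamped HTTP status code once a minute.
while true; do
  code=$(curl -s -o /dev/null -w "%{http_code}" \
    -H "Authorization: token YOUR_TOKEN" \
    https://api.github.com/repos/OWNER/REPO/commits/SHA/statuses)
  echo "$(date): $code"
  sleep 60
done >> /tmp/github-status-probe.log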

Intermittent CLI API 500 when Downloading File

I'm using a gdrive CLI command on a Jenkins web node to automatically download a file from Google Drive during a build process.
This used to work; however, recently (as of a week or two ago) the command to download the file intermittently started producing 500 errors with no message.
The command that's being run is: gdrive download query "name = '16.7.3.zip'".
Sometimes the above command downloads the file, but sometimes it doesn't. Here's an example of the error output:
Is anyone able to give advice on where to start with this issue? Is it something on Google's side?
I've read a few articles explaining that this might be throttling by the API; however, I'd have expected a different error code, i.e. a 403 with the error "The download quota for this file has been exceeded".
I have the following specs installed:
gdrive: 2.1.0
Golang: go1.6
OS/Arch: linux/amd64
Intermittent 500 errors are expected with the Google Drive API; you simply need to retry with exponential backoff. Generally they are caused by internal timeouts within the Drive infrastructure. Sometimes these are related to service problems; other times they can be caused by a request that triggers a large amount of work.
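A minimal backoff sketch around the same gdrive call; the retry limit is arbitrary, and it assumes gdrive exits non-zero on a failed download:

# Retry with exponential backoff: wait 1s, 2s, 4s, 8s between attempts.
delay=1
for attempt in 1 2 3 4 5; do
  if gdrive download query "name = '16.7.3.zip'"; then
    break
  fi
  echo "Attempt $attempt failed; retrying in ${delay}s..."
  sleep "$delay"
  delay=$((delay * 2))
done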

Issue Launching Grails project to cloudfoundry

I am having a problem launching my (Grails) project to Cloud Foundry. I have already launched with cf-push, but I keep getting this error:
I/O error: Connection reset; nested exception is java.net.SocketException: Connection reset
when I run cf-update.
I also cannot see my log files with cf-crashlogs. I get this in the terminal window:
grails> cf-crashlogs
| Checking for available resources:.....
And if I try to access the page, I get a 404 Not Found page.
Did I completely miss something? Has anyone else seen this or know how to fix this issue?
Please check which version of the Cloud Foundry Grails plugin you are using. Try listing the plugin updates with this command:
grails list-plugin-updates
After that, try to get the Cloud Foundry connection info with:
grails cf-info
I suppose you know how to configure the login info; all the configuration properties are listed here: http://grails-plugins.github.com/grails-cloud-foundry/docs/manual/guide/3%20Configuration.html
To access your app logs, the most commonly used command is:
grails cf-logs [destination] [--appname] [--instance] [--stderr] [--stdout] [--startup]
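For example, a hypothetical invocation (my-app is a placeholder application name):

grails cf-logs --appname=my-app --stderr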
Hope that helps.
I was trying to test Cloud Foundry a long time ago. I don't remember the details, but I also had some issues which I couldn't overcome using the default tool.
However, I then used the Cloud Foundry Integration plugin.
As I mentioned, it was some time ago, so I can't help with the details, but the plugin worked as expected and I was able to deploy. Maybe you will have success with it too :)
