Job taking forever to cancel - google-cloud-dataflow

I have had this issue twice in the past few days when trying to cancel a Dataflow job: it takes forever to cancel.
Last Thursday it took almost 9 hours to cancel, and now another job has been stuck for 2 minutes.
Is it not possible to kill a job directly?
What can explain such behaviour?

This is a new issue having to do with the size of the job and ensuring clean shutdown. We're investigating a few approaches to make shutdown faster.
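For what it's worth, cancellation can also be requested programmatically rather than through the console. The following is only a minimal sketch using the Dataflow v1b3 REST API through google-api-python-client; the project, region, and job ID are placeholders, and the service still goes through the same shutdown, so this does not by itself make a stuck cancellation finish faster.

# Minimal sketch: ask the Dataflow service to cancel a job via the v1b3 REST API.
# Assumes google-api-python-client is installed and Application Default Credentials
# are configured; PROJECT, REGION and JOB_ID below are placeholders.
from googleapiclient.discovery import build

PROJECT = "my-project"
REGION = "us-central1"
JOB_ID = "your-job-id"

dataflow = build("dataflow", "v1b3")

# Setting requestedState asks Dataflow to move the job to the cancelled state;
# the service still performs its own (possibly slow) shutdown.
request = dataflow.projects().locations().jobs().update(
    projectId=PROJECT,
    location=REGION,
    jobId=JOB_ID,
    body={"requestedState": "JOB_STATE_CANCELLED"},
)
print(request.execute())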

Related

How can I schedule a Jenkins job to be started only when it is not already running?

The cron part is easy and covered elsewhere. What I have is a job that usually runs for six hours but sometimes only two or three. If I schedule the job every six hours and it only runs for two, I waste four hours waiting for the next one. If I schedule it every hour, then when it runs for six hours, five jobs are going to stack up waiting their turn.
Rigging something in the job itself to figure out whether it is already running is suboptimal. There are four machines involved, so I can't examine process tables. I would need a semaphore on disk or in some other shared resource, which means making sure it is cleared not only when the job finishes, but also when it dies.
I could also query Jenkins to see if the job is already running, but that's more code I need to write and get right.
Is there a way to tell Jenkins directly: Schedule only if you are not already running?
Maybe disableConcurrentBuilds could help? https://www.jenkins.io/doc/book/pipeline/syntax/
You can also clean the build queue if needed.
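For the "query Jenkins to see if a job is already running" approach mentioned in the question, a small external trigger script can do it through the Jenkins REST API. This is only a sketch under assumptions: the requests library is available, the job lives at JOB_NAME on JENKINS_URL, and authentication uses an API token; all of those names are placeholders.

# Sketch of the "check first, then trigger" approach via the Jenkins REST API.
# JENKINS_URL, JOB_NAME and AUTH are placeholders for your own setup.
import requests

JENKINS_URL = "https://jenkins.example.com"
JOB_NAME = "nightly-job"
AUTH = ("user", "api-token")

def job_is_running():
    # lastBuild/api/json exposes a boolean "building" field for the most recent build;
    # a 404 here just means the job has never been built.
    resp = requests.get(f"{JENKINS_URL}/job/{JOB_NAME}/lastBuild/api/json", auth=AUTH)
    if resp.status_code == 404:
        return False
    resp.raise_for_status()
    return resp.json().get("building", False)

def trigger_if_idle():
    if job_is_running():
        print("Job is still running; skipping this trigger.")
        return
    # POSTing to /build queues a new run of the job.
    requests.post(f"{JENKINS_URL}/job/{JOB_NAME}/build", auth=AUTH).raise_for_status()
    print("Job queued.")

if __name__ == "__main__":
    trigger_if_idle()

Run something like this from cron instead of Jenkins' own scheduler, so a new build is only queued once the previous one has finished.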

Slow and weird draining process on Dataflow job

I have a streaming Dataflow job written with the Apache Beam Python SDK, and overall it works as expected. But since we want to update the code from time to time, we need to deploy a new job and drain the old one. As soon as I press the "Drain" button, the job starts to act weird. It scales the number of workers up to the maximum allowed, which I know is expected, but the CPU usage of those workers goes through the roof. The "System Latency" metric also keeps increasing. You can see it clearly in the screenshot at the point where I started the drain.
The drain didn't finish even after about 4 days, so I had to cancel the job.
One thing I don't understand is why the "Throughput" of the DoFns is 0 the whole time the job is draining.
How can I change my job to make it more drain-friendly?
You can find the code of the job in https://gist.github.com/PlugaruT/bd8ac4c007acbff5097cea33ccc1bbb1
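Since the job's code is only available in the gist above, the following is purely an illustrative sketch of one commonly suggested pattern for making a streaming pipeline easier to drain: keep buffered elements in finite windows with an explicit trigger, so the runner has a clear point at which to flush state when the watermark is advanced during a drain. The source, window size, and transforms below are assumptions, not taken from the actual job.

# Illustrative sketch only; the real job is in the gist linked above, so the
# Pub/Sub topic, window size and transforms here are placeholders.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms import trigger, window

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    (
        p
        | "Read" >> beam.io.ReadFromPubSub(topic="projects/my-project/topics/my-topic")
        # Finite windows with a watermark trigger give the runner a well-defined
        # point at which buffered state can be flushed when a drain is requested.
        | "Window" >> beam.WindowInto(
            window.FixedWindows(60),
            trigger=trigger.AfterWatermark(),
            accumulation_mode=trigger.AccumulationMode.DISCARDING,
        )
        | "KeyBySource" >> beam.Map(lambda msg: ("all", msg))
        | "Group" >> beam.GroupByKey()
        | "Write" >> beam.Map(print)  # placeholder sink
    )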

PC wakes up automatically only for a few minutes (I need more time)

I set up Jenkins to run some jobs (about 10 minutes of work) at night. By that time the PC has already gone to sleep, which is why I set up Task Scheduler:
triggered every day, with the action cmd.exe /c "exit". This wakes the PC up for only two minutes, but I need 10 (and more in the future). Would you mind helping me solve this problem?
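There is no accepted answer here, but one possible approach, assuming you can run a small helper script from the scheduled task (or from the Jenkins job itself) on that Windows machine, is to hold a "system required" power request for as long as the job needs; the 10-minute duration below is just a placeholder.

# Sketch only: keeps a Windows machine awake for a fixed duration by holding an
# execution-state power request via the Win32 SetThreadExecutionState call.
import ctypes
import time

ES_CONTINUOUS = 0x80000000       # keep the requested state in effect until cleared
ES_SYSTEM_REQUIRED = 0x00000001  # prevent the system from going back to sleep

def stay_awake(minutes):
    kernel32 = ctypes.windll.kernel32
    # Tell Windows the system is required so it will not sleep while we wait.
    kernel32.SetThreadExecutionState(ES_CONTINUOUS | ES_SYSTEM_REQUIRED)
    try:
        time.sleep(minutes * 60)
    finally:
        # Clear the request so normal power management resumes afterwards.
        kernel32.SetThreadExecutionState(ES_CONTINUOUS)

if __name__ == "__main__":
    stay_awake(10)  # placeholder: keep the PC awake long enough for the Jenkins job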

All background jobs failing with max concurrent job limit reached - Parse

I have a background job that runs every minute, and it has been working fine for the last few weeks. All of a sudden, when I logged in today, every job was failing instantly with "max concurrent job limit reached". I have tried deleting the job and waiting 15 minutes so that any currently running job can finish, but when I schedule the job again it just starts failing every time like before. I don't understand why Parse thinks I am running a job when I am not.
Someone deleted my previous answer (not sure why). This is, and continues to be, a Parse bug. See this Google Group thread where other people report the issue (https://groups.google.com/forum/#!msg/parse-developers/_TwlzCSosgk/KbDLHSRmBQAJ); there are several open Parse bugs about this as well.

How to run jobs longer than 24 hours on Heroku?

How can we run background jobs that take longer than 24 hours on Heroku? Since every dyno is restarted once a day, it seems impossible, even with a dedicated worker dyno to handle it. Is writing my job in a way that it will continue from where it stopped when it was killed the only way?
Thanks,
Michal
Ideally you want to break that long job up into a number of small jobs that can be handled independently. What exactly are you doing that takes more than 24 hours?
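Both suggestions, resuming from where the job stopped and splitting the work into independent chunks, come down to persisting progress outside the dyno. A minimal sketch of that idea, where load_cursor, save_cursor and fetch_batch are hypothetical helpers standing in for whatever datastore the app already uses:

# Minimal sketch of a resumable worker: process items in small batches and persist
# a cursor after each batch, so a restarted dyno resumes from the saved position.
# load_cursor, save_cursor and fetch_batch are hypothetical helpers, not a real API.
import time

def load_cursor():
    ...  # e.g. read the last processed id from Postgres or Redis (hypothetical)

def save_cursor(cursor):
    ...  # e.g. write the last processed id back to the same store (hypothetical)

def fetch_batch(cursor, size=100):
    ...  # e.g. select the next `size` items after `cursor` (hypothetical)

def process(item):
    ...  # the actual unit of work

def run_worker():
    cursor = load_cursor()
    while True:
        batch = fetch_batch(cursor)
        if not batch:
            time.sleep(30)  # nothing left to do right now; poll again later
            continue
        for item in batch:
            process(item)
            cursor = item["id"]
        # Saving after every batch bounds how much work is repeated when
        # Heroku restarts the dyno.
        save_cursor(cursor)

if __name__ == "__main__":
    run_worker()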
