When a job in Sidekiq fails, it goes into the retry queue and is retried up to 25 times, as per https://github.com/mperham/sidekiq/wiki/Error-Handling#automatic-job-retry. So the question is: is there any way to find out whether the job that is currently executing is running for the first time or is the nth retry of that job?
Note this job is running separately in a worker.
P.S. I'm new to Sidekiq, workers and async jobs, so pardon me if the question is unclear or an obvious one.
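One way this is commonly approached is to look at the job payload hash, which Sidekiq passes to server middleware. As far as I know, the payload has no "retry_count" key on the first run, and the key is set to 0 for the first retry and incremented after that. The sketch below is plain Ruby so it runs without the gem; `retry_number` and `RetryAwareness` are hypothetical names, and the payload layout is an assumption worth verifying against your Sidekiq version.

```ruby
# Hypothetical helper: returns 0 for the first run and n for the n-th retry.
# Assumes "retry_count" is absent on the first run and 0 on the first retry.
def retry_number(job)
  count = job["retry_count"]
  count.nil? ? 0 : count + 1
end

# Shaped like a Sidekiq server middleware (call(worker, job, queue) + yield).
# In a real app it would be registered with something like:
#   Sidekiq.configure_server do |config|
#     config.server_middleware { |chain| chain.add RetryAwareness }
#   end
class RetryAwareness
  def call(_worker, job, _queue)
    n = retry_number(job)
    puts(n.zero? ? "first run of #{job['class']}" : "retry ##{n} of #{job['class']}")
    yield
  end
end

# Simulated invocations with hand-written payloads (illustrative data only).
middleware = RetryAwareness.new
middleware.call(nil, { "class" => "HardWorker", "args" => [] }, "default") { }
middleware.call(nil, { "class" => "HardWorker", "args" => [], "retry_count" => 0 }, "default") { }
```

The middleware only logs here; the same check could be used to skip or alter work on retries.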
On my Jenkins, the worker "slave02" sometimes goes offline and has to be unstuck manually. I will not go into details, because that is not the point of this question.
The scenario so far:
I've intentionally configured a job to be processed on that exact worker. Obviously it will not start while the worker is offline. I want to be notified when that job gets stuck in the queue. I've tried the Build Timeout Jenkins plugin and configured it to fail the build if the job takes longer than 5 minutes to complete.
The problem is that the plugin fails the job 5 minutes after the build has started, which does not help in my case, because the job never starts: it sits in the queue waiting to be processed, and that never happens. So my question is: is there a way to make the job check whether that worker is down and, if so, automatically fail the build and send a notification?
I am pretty sure this can be done, but I could not find a thread where this type of scenario is discussed.
I am new to RoR and want to confirm something. If I run a long scheduled job, will it block other scheduled jobs? I have other jobs running every 5 minutes, and I plan to write something that can easily run for more than 3 hours. Will it block the 5-minute jobs?
The whenever gem is basically only a way to configure and handle Cron jobs.
That said: at the given time, Cron will simply start and run a configured job. Cron will not block other jobs, nor does it care whether a job fails or whether another job is still running.
Limiting factors might be:
Memory/CPU consumption: Each job consumes memory and CPU. If too many jobs run at the same time, your server might run out of memory or come under high load. But this doesn't really block other jobs; it just slows down the whole server.
Database locks: If your jobs perform tasks that lock database tables, other queries might be blocked and have to wait. But this is not Cron-specific; it depends on what your code actually does.
My environment: CentOS, Rails 4, Ruby 2.
I'm running an infinite loop in Delayed Job that fetches real-time information from another site.
The task is to run many processes simultaneously to get information from different sites. For each new process, I add it to the DJ queue by running search_engine.delay.track!.
When I run one worker, it successfully takes the first job from the queue and, when that job completes, takes the next one. When I run more than one worker with ./bin/delayed_job -n5 start, every DJ worker takes the first job in the queue and starts to track! it. But I want the workers to take only jobs that have not been taken by other workers.
It was because of Delayed::Worker.max_run_time = 0. DJ thought that the work was already done and ran it again.
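To illustrate why a zero max_run_time has this effect: as far as I understand the Active Record backend, a worker considers a job available when it is unlocked or when its lock is older than max_run_time. The sketch below models that rule in plain Ruby. `Job` and `available?` are hypothetical names made up for this example, not Delayed Job's API, and the 4-hour value is what I believe the default to be.

```ruby
# Hypothetical model of a queued job with lock bookkeeping.
Job = Struct.new(:locked_at, :locked_by)

# Assumed reservation rule: a job can be picked up if nobody holds it, or if
# its lock is older than max_run_time (the lock is then treated as stale).
def available?(job, now:, max_run_time:)
  job.locked_at.nil? || job.locked_at < now - max_run_time
end

now = Time.now
job = Job.new(now - 60, "worker-1") # locked one minute ago by another worker

# With max_run_time = 0 every existing lock is already "too old".
puts available?(job, now: now, max_run_time: 0)
# With a 4-hour limit the one-minute-old lock is still respected.
puts available?(job, now: now, max_run_time: 4 * 3600)
```

Under this model, each of the five workers sees the first job as free even though another worker is still running it.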
Is it possible for 1 job to be processed twice by 2 different Sidekiq threads? I am using Sidekiq to insert some analytics events into a MongoDB collection asynchronously. I see around 15 duplicates in that collection. My guess is that 2 worker threads picked up the same job at the same time and both added it to the collection.
Does Sidekiq ensure that a job is picked up by only 1 thread? We can ignore the restart case, as the jobs are small and will complete in less than 8s.
Is firing analytics events asynchronously using Sidekiq not a good practice? What are my options? I could add a unique key to the event and check it before insert to avoid inserting duplicates, but that adds data (plus an overhead/query) that I am never going to use (and it adds up over millions of events). Can I somehow ensure that a job is processed only once by Sidekiq?
Thanks for your help.
No. Sidekiq uses Redis as a work queue for background processing. Redis provides atomic operations for adding jobs to the queue and popping jobs off of the queue (specifically the redis BRPOP command). Each Sidekiq worker tries to fetch a job from the queue with a timeout via BRPOP and any given job popped from the queue will only be returned to one of the workers pulling work from the queue.
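The "one pop, one consumer" guarantee can be pictured with a small analogy. This is not Sidekiq or Redis code: Ruby's thread-safe `Queue` stands in for the Redis list, and each thread stands in for a worker thread. The job count and number of threads are arbitrary values chosen for the sketch.

```ruby
# Analogy only: a thread-safe in-process queue instead of a Redis list.
queue = Queue.new
100.times { |i| queue << i }
queue.close # pop returns nil once the closed queue is empty

processed = Queue.new # thread-safe record of [worker_id, job] pairs

workers = 5.times.map do |worker_id|
  Thread.new do
    # Each pop hands a given job to exactly one of the competing threads.
    while (job = queue.pop)
      processed << [worker_id, job]
    end
  end
end
workers.each(&:join)

jobs = []
jobs << processed.pop.last until processed.empty?

puts "jobs processed: #{jobs.size}"
puts "distinct jobs:  #{jobs.uniq.size}"
```

Both counts come out equal, which is the property the atomic pop provides: duplicates have to come from somewhere other than two consumers receiving the same queue entry.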
What is more likely is that you are enqueuing multiple jobs.
Another possibility is that your job is throwing an error, causing it to partially execute and then be retried multiple times. By default Sidekiq will retry failed jobs, but it doesn't have any built-in mechanism for transactions or atomicity of work. That is, if your Sidekiq job does A, B, and C, and doing B raises an exception that makes the job fail, the job will be retried, and A will run again on every retry.
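The partial-execution case can be reproduced without Sidekiq at all. The following is a minimal sketch in plain Ruby: `run_with_retries`, `NaiveJob`, and `IdempotentJob` are hypothetical names, the retry loop only imitates the idea of retrying a failed job, and the in-memory `Set` stands in for whatever uniqueness check (for example a unique index) the real store would provide.

```ruby
require "set"

# Hypothetical stand-in for a retry mechanism: re-run the job until it
# succeeds or the attempts run out. Returns the number of attempts made.
def run_with_retries(job, max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    job.perform(attempts)
  rescue StandardError
    retry if attempts < max_attempts
    raise
  end
  attempts
end

# Step A records an event; step B fails on the first two attempts.
class NaiveJob
  attr_reader :events

  def initialize
    @events = []
  end

  def perform(attempt)
    @events << "signup"                  # A: runs again on every attempt
    raise "step B failed" if attempt < 3 # B: fails twice, then succeeds
    :done                                # C
  end
end

# Same job, but step A is guarded by an event id so repeating it is harmless.
class IdempotentJob
  attr_reader :events

  def initialize
    @events = []
    @seen_ids = Set.new
  end

  def perform(attempt, event_id: "evt-1")
    @events << "signup" if @seen_ids.add?(event_id) # A: at most once per id
    raise "step B failed" if attempt < 3
    :done
  end
end

naive = NaiveJob.new
run_with_retries(naive)
idempotent = IdempotentJob.new
run_with_retries(idempotent)

puts "naive job recorded #{naive.events.size} events"
puts "idempotent job recorded #{idempotent.events.size} events"
```

The naive job records the event once per attempt, while the guarded version records it once overall, which is the trade-off the question describes: some extra data or a check in exchange for retry safety.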
At least once a day my Delayed::Job workers will randomly stop working jobs off the queue, yet the processes are still alive.
Pictured: "Zombies"
When I inspect the remaining jobs in the queue, none show that they are locked or being worked by the zombified workers in question. Even when looking at failed jobs, it's hard to make a definite connection between a failure and the workers going into zombie mode.
I have a theory that a job has an error that causes workers to segfault, but not completely die. Is there any way to inspect a worker process and see what it's doing? How would one go about debugging this issue when there's not even a stacktrace or failed job to inspect?