Rails - Old cron job keeps running, can't delete it - ruby-on-rails

So I'm using Rails and I have a few Sidekiq workers, but none are enabled. I'm using the sidekiq-cron gem, which requires you to put files in app/workers/, configure a sidekiq scheduler in config/sidekiq_schedule.yml, and also add a few lines in config/initializers/sidekiq.rb. However, I've commented everything out from sidekiq_schedule.yml and also commented the following lines out from sidekiq.rb:
# Sidekiq scheduler.
# schedule_file = 'config/sidekiq_schedule.yml'
# if File.exist?(schedule_file) && Sidekiq.server?
#   Sidekiq::Cron::Job.load_from_hash! YAML.load_file(schedule_file)
# end
However, if I launch Sidekiq, every minute (which is the old schedule), I see this in the prompt:
2018-01-19T02:54:04.156Z 22197 TID-ovsidcme8 ActiveJob::QueueAdapters::SidekiqAdapter::JobWrapper JID-8609429b89db2a91793509ea INFO: start
2018-01-19T02:54:04.164Z 22197 TID-ovsidcme8 ActiveJob::QueueAdapters::SidekiqAdapter::JobWrapper JID-8609429b89db2a91793509ea INFO: fail: 0.008 sec
and it fails because it's trying to launch a job that's not supposed to be launching.
I went to the Rails console (rails c) and tried to find the job, but nothing's in there:
irb(main):001:0> Sidekiq::Cron::Job.all
=> []
so I'm not quite sure why it's constantly trying to launch a job. If I go to the rails interface on my application, I don't see anything in the queue, nothing being processed, busy, retries, enqueued, nothing.
Any suggestions would be greatly appreciated. I've been trying to hunt this down for the last hour and have had no success. I even removed ALL of the workers from the workers directory, and yet it's still trying to launch one of them.

Because you have already loaded the jobs once, I think their configuration is still stored in Redis. You can check this assumption by opening a new terminal tab and running this in redis-cli:
KEYS '*cron*'
If those keys exist in Redis, deleting them will fix your issue.
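The same check-and-clear step can be sketched in Ruby. This is a minimal sketch, assuming a client that responds to scan_each(match:) and del, as the redis gem's client does. FakeRedis is a hypothetical in-memory stand-in so the example runs without a server, and the key names are made up for illustration.

```ruby
# Hypothetical in-memory stand-in for a Redis client, so the sketch runs
# without a server. It mimics only scan_each(match:) and del.
class FakeRedis
  def initialize(keys)
    @store = keys.to_h { |k| [k, "value"] }
  end

  # Yields every key matching a glob pattern, like Redis#scan_each.
  def scan_each(match:)
    return enum_for(:scan_each, match: match) unless block_given?
    @store.keys.each { |k| yield k if File.fnmatch(match, k) }
  end

  # Deletes the given keys and returns how many were removed, like Redis#del.
  def del(*keys)
    keys.flatten.count { |k| !@store.delete(k).nil? }
  end

  def keys
    @store.keys
  end
end

# Finds keys matching the pattern and deletes them.
# Returns the list of keys that were deleted.
def clear_matching_keys(redis, pattern = "*cron*")
  stale = redis.scan_each(match: pattern).to_a
  redis.del(*stale) unless stale.empty?
  stale
end

redis = FakeRedis.new(["cron_jobs", "cron_job:old_worker", "queue:default"])
deleted = clear_matching_keys(redis)
puts "deleted: #{deleted.inspect}"
puts "remaining: #{redis.keys.inspect}"
```

Against a real server you would pass an actual client instead of FakeRedis. Print the matching keys before deleting anything, since the pattern may catch keys you still need.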

Since you mentioned a cron job in your title but not in the question, I'm assuming there's a cronjob running the background sidekiq task.
Try running crontab -l in Terminal to see all your cron jobs. If you see an entry starting with "* * * * *", that means there's a job that is running every minute.
Then, use crontab -r to clear your crontab. Note that this deletes all scheduled tasks for the current user, not just that one; use crontab -e if you only want to remove a single entry.
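If the crontab is long, the every-minute entries can be picked out with a few lines of Ruby. This is only a sketch: every_minute_entries is a hypothetical helper, and the sample text stands in for real crontab -l output.

```ruby
# Returns the crontab lines whose five schedule fields are all "*",
# i.e. entries that run every minute. Comments and blank lines are skipped.
def every_minute_entries(crontab_text)
  crontab_text.each_line.map(&:strip).select do |line|
    next false if line.empty? || line.start_with?("#")
    line.split(/\s+/, 6).first(5) == ["*"] * 5
  end
end

# Sample text standing in for the output of `crontab -l` (made up for illustration).
sample = <<~CRON
  # m h dom mon dow command
  * * * * * cd /srv/app && bundle exec rake some:task
  0 3 * * * /usr/local/bin/backup.sh
CRON

puts every_minute_entries(sample)
```

To run it against the real crontab, pass in the output of crontab -l captured with backticks.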

Related

Clear worker cache in delayed jobs in production

I am using delayed jobs in my Rails application. It works fine, but an issue has appeared on the production server. I created a class in lib and call its method from a controller to generate a CSV file through delayed jobs. It was working fine when I ran the delayed jobs locally and on the production server, but then I changed the file naming convention in this class and restarted the delayed jobs locally and then on the production server. Now when I call that method through a delayed job, it sometimes works according to the latest changes and sometimes uses the old file naming logic.
What could be the issue?
Delayed job has a hidden "feature" which is to ignore any changes to your app, and just use old settings, env-variables, email-templates, etc. You can clear every cache and restart your server, and it still holds onto data which no longer exists anywhere in your app's codebase.
delayed_job - Performs not up to date code?
Also be aware that DJ's "restart" does not always kill and restart all the workers, so you need to hunt them down and kill them all manually with
ps aux | grep delay
See: Rails + Delayed Job => email view template does not get updated
I have not yet found a "clear delayed job cache" function. If it exists, someone please post it here.
In my case, I just spent almost 4 hours trying everything to delete failing delayed_jobs on Heroku. If you got here trying to kill a zombie delayed_job but you're on Heroku, the above won't work.
You cannot run ps aux like you would on a regular server, nor can you run rake jobs:clear, and if you check via the Rails console you'll see the jobs there, but not in the database, so there's nothing you can do there either.
What I did was put the app in maintenance mode, make a deployment that completely removed the delayed_job gem and all references to it, and then make another deployment reverting that change. That cleared the zombie cache and did the trick.
I had a similar issue in Dokku. My solution was to remove the worker=1 entry from my DOKKU_SCALE file (so all it contained was web=1) and also to remove the worker: bundle exec rake jobs:work line from my Procfile.
I pushed that to my production server, reversed the changes above, pushed again and it was fixed.

delayed_job -i via cron script through ruby will not start after stopping previous processes

So I have a weird situation: I have delayed_job 2.0.7, daemons 1.0.10, Ruby 1.8.7 and Rails 2.3.5 running on Scientific Linux release 6.3 (Carbon).
I have a rake task that restarts delayed jobs each night and then does a bunch of batch processing. I used to just do ruby script/delayed_job stop and then start. I have since added a backport of named queues, so I now want to start several processes for each named queue. The best way I found to do this is to use -i to name each process differently so they don't collide.
I wrote some Ruby code to do this looping and it works great in dev, it works great on the command line, and it works great when called from the Rails console. But when called from cron it fails silently: the call returns false but there is no error message.
# this works
system_call_result1 = %x{ruby script/delayed_job stop}
SnapUtils.log_to_both "result of stop all - #{system_call_result1} ***"
# this works
system_call_result2 = %x{mv log/delayed_job.log log/delayed_job.log.#{Date.today.to_s}}
SnapUtils.log_to_both "dj log file has been rotated"
# this fails, result is empty string, if I use system I get false returned
for x in 1..DELAYED_JOB_MAX_THREAD_COUNT
  system_call_result = %x{ruby script/delayed_job -i default-#{x} start}
  SnapUtils.log_to_both "result of start default queue iteration #{x} - #{system_call_result} ***"
end
# this fails the same way
for y in 1..FOLLOWERS_DELAYED_JOB_MAX_THREAD_COUNT
  system_call_result = %x{ruby script/delayed_job --queue=follower_ids -i follower_ids-#{y} start}
  SnapUtils.log_to_both "result of start followers queue iteration #{y} - #{system_call_result} ***"
end
After a lot of trial and error I found that this problem only happens if I use -i (named processes), and only if I stop them and then try to start them. If I remove the stops then everything works fine.
Again this is only when I use cron.
If I use command line or console to run, it works fine.
So my question is, what could cron be doing differently that causes these named dj processes not to start if you previously stopped them in the same ruby process?
thanks
Joel
OK, I finally figured this out. When checking whether cron would send email, we found that sendmail was broken: the version of MySQL that sendmail wanted was not installed. We fixed that and our problem magically went away. I would still offer the bounty to anyone who can explain exactly why.
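One thing that makes this kind of failure easier to diagnose: %x{} only captures standard output, so anything the command writes to standard error never reaches the log. The sketch below uses Open3.capture3 from the standard library to collect stdout, stderr and the exit status together. run_and_report is a hypothetical helper, and the child process is a stand-in for the failing delayed_job call. It does not explain the sendmail connection.

```ruby
require "open3"
require "rbconfig"

# Runs a command and returns stdout, stderr and the exit status together,
# so a failure under cron leaves something to log.
def run_and_report(*command)
  stdout, stderr, status = Open3.capture3(*command)
  { stdout: stdout, stderr: stderr, success: status.success?, exit_code: status.exitstatus }
end

# Stand-in for a failing `ruby script/delayed_job ... start` call:
# a child Ruby process that writes to stderr and exits with status 3.
result = run_and_report(RbConfig.ruby, "-e", 'warn "could not start"; exit 3')
puts result.inspect
```

In the rake task, the stderr and exit_code values could be passed to the same logger that currently receives the empty %x{} result.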

"Whenever" gem running cron jobs on Heroku

I created an app that uses the whenever gem. The gem creates cron jobs. I got it working locally but can't seem to get it working on heroku cedar. What's the command to do this?
running:
heroku run whenever --update-crontab job1
doesn't work
Short answer: use the scheduler add-on: http://addons.heroku.com/scheduler
Long answer: When you do heroku run, we
spin up a dyno
put your code on it
execute your command, wait for it to finish
throw the dyno away
Any changes you made to crontab would be immediately thrown away. Everything is ephemeral, you cannot edit files on heroku, just push new code.
You need to add Heroku Scheduler addon.
You can add it directly from your dashboard or using following commands:
install the add-on:
heroku addons:create scheduler:standard
Create a rake task in lib/tasks
# lib/tasks/scheduler.rake
task :send_reminders => :environment do
  User.send_reminders
end
Schedule job
Visit Heroku Dashboard
Open your app
Select Scheduler from add-ons list
Click Add Job, enter a task and select frequency.
e.g. Add rake send_reminders, select "Daily" and "00:00" to send reminders every day at midnight.
The other answers say you should use the Heroku Scheduler add-on, and it can indeed run background tasks, but it doesn't offer the flexibility of cron.
There's another add-on, called Cron To Go, that can run your jobs on one-off dynos with cron's flexibility. You can also specify a time zone for your job and get notifications (email or webhook) when a job fails, succeeds or starts.
(Full disclosure - I work for the company that created and operates Cron To Go)
If you:
want to use Heroku Scheduler
want to run tasks every minute (not every 10 min)
don't care about dyno hours
then this was my hack to run jobs every minute, assuming the task completes in under 60 seconds.
task start_my_service: :environment do
  1.upto(9) do |iteration|
    start_time = DateTime.now
    Services::MyService.call
    end_time = DateTime.now
    wait_time = 60 - ((end_time - start_time) * 24 * 60 * 60).to_i
    sleep wait_time if wait_time > 0
  end
end
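The timing logic of that loop can be pulled out and exercised without waiting. This is a sketch, not the answer's code: it measures elapsed time with the monotonic clock instead of DateTime subtraction. remaining_wait and run_repeatedly are hypothetical names, and the injectable sleeper exists only so the loop can be tested.

```ruby
# Seconds left in the current interval after a run that took `elapsed` seconds.
# Never negative, so a slow run simply starts the next one immediately.
def remaining_wait(elapsed, interval = 60)
  [interval - elapsed, 0].max
end

# Calls the block `times` times, spacing the start of each call `interval`
# seconds apart. `sleeper` is injectable so the loop can run without waiting.
def run_repeatedly(times:, interval: 60, sleeper: ->(s) { sleep(s) })
  times.times do
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    wait = remaining_wait(elapsed, interval)
    sleeper.call(wait) if wait > 0
  end
end

# Demonstration with a recording sleeper instead of real sleeping.
calls = 0
waits = []
run_repeatedly(times: 3, interval: 60, sleeper: ->(s) { waits << s }) { calls += 1 }
puts "ran #{calls} times, would have slept #{waits.map(&:round).inspect} seconds"
```

Inside the rake task, the block would be the Services::MyService.call line and the default sleeper would be used.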
Heroku doesn't support cron jobs, and there are two drawbacks to the Heroku Scheduler:
you cannot choose an arbitrary interval or time at which to run jobs (it's either every 10 mins, 1 hour or daily).
your jobs are not defined in code, hence not in your versioning system and not easy to keep track of or modify.
Heroku does provide an alternative: custom clock processes. But the clock process requires its own dyno, and "Since dynos are restarted at least once a day some logic will need to exist on startup of the clock process to ensure that a job interval wasn’t skipped during the dyno restart".
Simple scheduler is a gem made specifically for scheduling on Heroku, but it seems a bit hackish.
I ended up using sidekiq-cron. The only drawback: if Sidekiq is down right when a job is scheduled to run, the job won't run.

How do I clear stuck/stale Resque workers?

As you can see from the attached image, I've got a couple of workers that seem to be stuck. Those processes shouldn't take longer than a couple of seconds.
I'm not sure why they won't clear or how to manually remove them.
I'm on Heroku using Resque with Redis-to-Go and HireFire to automatically scale workers.
None of these solutions worked for me; I would still see this in resque-web:
0 out of 10 Workers Working
Finally, this worked for me to clear all the workers:
Resque.workers.each {|w| w.unregister_worker}
In your console:
queue_name = "process_numbers"
Resque.redis.del "queue:#{queue_name}"
Otherwise you can try to fake them as being done to remove them, with:
Resque::Worker.working.each {|w| w.done_working}
EDIT
A lot of people have been upvoting this answer and I feel that it's important that people try hagope's solution which unregisters workers off a queue, whereas the above code deletes queues. If you're happy to fake them, then cool.
You probably have the resque gem installed, so you can open the console and get current workers
Resque.workers
It returns a list of workers
#=> [#<Worker infusion.local:40194-0:JAVA_DYNAMIC_QUEUES,index_migrator,converter,extractor>]
Pick a worker and call prune_dead_workers on it, for example the first one:
Resque.workers.first.prune_dead_workers
Adding to the answer by hagope, I wanted to be able to unregister only workers that had been running for a certain amount of time. The code below will only unregister workers running for over 300 seconds (5 minutes).
Resque.workers.each {|w| w.unregister_worker if w.processing['run_at'] && Time.now - w.processing['run_at'].to_time > 300}
I have an ongoing collection of Resque related Rake tasks that I have also added this to: https://gist.github.com/ewherrmann/8809350
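The age filter in that one-liner can be tried out without Resque or Redis. This is a minimal sketch, assuming the worker's processing hash carries a run_at timestamp string as in the code above. FakeWorker and stuck_workers are hypothetical names, and Time.parse stands in for ActiveSupport's String#to_time.

```ruby
require "time"

# Hypothetical stand-in for a Resque worker: `processing` mimics the hash
# Resque stores for the job in progress (empty when the worker is idle).
FakeWorker = Struct.new(:id, :processing) do
  attr_reader :unregistered

  def unregister_worker
    @unregistered = true
  end
end

# Selects workers whose current job started more than `max_age` seconds ago.
def stuck_workers(workers, max_age:, now: Time.now)
  workers.select do |w|
    run_at = w.processing["run_at"]
    run_at && now - Time.parse(run_at) > max_age
  end
end

now = Time.utc(2018, 1, 19, 3, 0, 0)
workers = [
  FakeWorker.new("host:1:default", { "run_at" => "2018-01-19T02:50:00Z" }), # 10 minutes
  FakeWorker.new("host:2:default", { "run_at" => "2018-01-19T02:59:30Z" }), # 30 seconds
  FakeWorker.new("host:3:default", {})                                      # idle
]

stuck = stuck_workers(workers, max_age: 300, now: now)
stuck.each(&:unregister_worker)
puts stuck.map(&:id).inspect
```

Listing the selected worker ids before unregistering them is a cheap way to confirm the threshold catches only the workers you expect.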
Run this command wherever you ran the command to start the server
$ ps -e -o pid,command | grep [r]esque
you should see something like this:
92102 resque: Processing ProcessNumbers since 1253142769
Make note of the PID (process ID); in my example it is 92102.
Then you can quit the process in one of two ways:
Gracefully: kill -QUIT 92102
Forcefully: kill -TERM 92102
Let me know if you have any trouble.
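The same signals can be sent from Ruby with Process.kill. The sketch below is illustrative only: stop_process is a hypothetical helper, and a sleeping child process stands in for the stuck worker. Process.wait2 only works on child processes of the current script, so for an unrelated worker you would send the signal and skip the wait.

```ruby
# Sends a signal to a child process and waits for it to exit.
# Returns the Process::Status of the finished child.
def stop_process(pid, signal = "QUIT")
  Process.kill(signal, pid)
  _, status = Process.wait2(pid)
  status
end

# Stand-in for a stuck worker: a child process that just sleeps.
pid = Process.spawn("sleep", "60")
status = stop_process(pid, "TERM")
puts "terminated by signal #{status.termsig}"
```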
I just did:
% rails c production
irb(main):001:0>Resque.workers
Got the list of workers.
irb(main):002:0>Resque.remove_worker(Resque.workers[n].id)
... where n is the zero based index of the unwanted worker.
I had a similar problem: Redis had saved a DB snapshot to disk that included invalid (non-running) workers. Each time Redis/Resque was started, they reappeared.
Fix this using:
Resque::Worker.working.each {|w| w.done_working}
Resque.redis.save # Save the DB to disk without ANY workers
Make sure you restart Redis and your Resque workers.
Started working on https://github.com/shaiguitar/resque_stuck_queue/ recently. It's not a solution to how to fix stuck workers but it addresses the issue of resque hanging/being stuck, so I figured it could be helpful for people on this thread. From README:
"If resque doesn't run jobs within a certain timeframe, it will trigger a pre-defined handler of your choice. You can use this to send an email, pager duty, add more resque workers, restart resque, send you a txt...whatever suits you."
Been used in production and works pretty well for me thus far.
Here's how you can purge them from Redis by hostname. This happens to me when I decommission a server and workers do not exit gracefully.
Resque.workers.each { |w| w.unregister_worker if w.id.start_with?(hostname) }
I ran into this issue and started down the path of implementing a lot of the suggestions here. However, I discovered the root cause that was creating this issue was that I was using the gem redis-rb 3.3.0. Downgrading to redis-rb 3.2.2 prevented these workers from getting stuck in the first place.
I've cleared them out from redis-cli directly. Luckily redistogo.com allows access from environments outside heroku.
Get dead worker ID from the list. Mine was
55ba6f3b-9287-4f81-987a-4e8ae7f51210:2
Run this command in redis directly.
del "resque:worker:55ba6f3b-9287-4f81-987a-4e8ae7f51210:2:*"
You can monitor redis db to see what it's doing behind the scenes.
redis xxx.redistogo.com> MONITOR
OK
1380274567.540613 "MONITOR"
1380274568.345198 "incrby" "resque:stat:processed" "1"
1380274568.346898 "incrby" "resque:stat:processed:c65c8e2b-555a-4a57-aaa6-477b27d6452d:2:*" "1"
1380274568.346920 "del" "resque:worker:c65c8e2b-555a-4a57-aaa6-477b27d6452d:2:*"
1380274568.348803 "smembers" "resque:queues"
Second last line deletes the worker.
In resque 2.0.0, here's one way that seems to work to remove only the apparently dead workers:
Resque::Worker.all_workers_with_expired_heartbeats.each { |w| w.unregister_worker }
I am not an expert in what's going on; it's possible there's a better way to do this, or that this will have problems. I'm just trying to figure this out too.
This seems to remove workers that haven't sent a "heartbeat" in much longer than expected from the resque worker list.
If the phantom worker was in a "running" state, then a new entry corresponding to the phantom job will be created in the "failed" job queue.
I had stuck/stale resque workers here too, or should I say 'jobs', because the worker is actually still there and running fine, it's the forked process that is stuck.
I chose the brutal solution of killing, via a bash script, any forked process that has been "Processing" for more than 5 minutes; the worker then just spawns the next job in the queue, and everything keeps going.
Have a look at my script here: https://gist.github.com/jobwat/5712437
If you are using newer versions of Resque, you'll need to use the following command as the internal APIs have changed...
Resque::WorkerRegistry.working.each {|work| Resque::WorkerRegistry.remove(work.id)}
This avoids the problem as long as you have a resque version newer than 1.26.0:
resque: env QUEUE=foo TERM_CHILD=1 bundle exec rake resque:work
Keep in mind that it does not let the currently running job finish.
If you use Docker, you can also use these commands, where <id> is the worker id:
docker stop <id>
docker start <id>

Getting delayed_job to just work

I followed the railscast which uses CollectiveIdea's fork. I'm not able to get it to work. I created a new file in my /lib folder and included this
class Device
  def deliver
    # my long running method
  end
  handle_asynchronously :deliver
end
device = Device.new
device.deliver
I do a script/delayed_job and that forks an app instance. Now,
There's no job activity going on. Nothing in the delayed_jobs table and nothing in the logs. Am I missing something here?
How do I set the interval for which the method should be run? (Ex every 30 seconds)
I'm testing this in the development mode (Rails 2.3.2), and soon will be moving this into production.
Thanks !
Do you see a process for the script/delayed_job that you ran? Do a ps aux | grep delayed_job and see if there is a process running.
AFAIK, you cannot set any time intervals using Delayed Job.
As a first step to diagnose the problem:
Stop your job workers
Launch a delayed job
Check whether it is present in the database.
