Getting delayed_job to just work - ruby-on-rails

I followed the railscast which uses CollectiveIdea's fork. I'm not able to get it to work. I created a new file in my /lib folder and included this
class Device
  def deliver
    # my long running method
  end
  handle_asynchronously :deliver
end
device = Device.new
device.deliver
I run script/delayed_job, which forks an app instance. Now:
There's no job activity going on. Nothing in the delayed_jobs table and nothing in the logs. Am I missing something here?
How do I set the interval at which the method should run (e.g. every 30 seconds)?
I'm testing this in development mode (Rails 2.3.2), and will soon be moving it into production.
Thanks !

Do you see a process for the script/delayed_job that you ran? Do a ps aux | grep delayed_job and see if there is a process running.
AFAIK, you cannot set any time intervals using Delayed Job.
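One workaround for the missing interval setting is to have the job schedule its own next run. Below is a minimal sketch of that idea, not Delayed Job's own API: RecurringDeliver, InMemoryQueue and the enqueue(job, run_at:) signature are made up for illustration. In a real app the queue would be Delayed::Job, whose enqueue arguments differ between versions, so check the version you have installed.

```ruby
# A job that schedules its own next run once the current run has finished.
class RecurringDeliver
  INTERVAL = 30 # seconds between runs

  # queue: anything responding to enqueue(job, run_at:) (hypothetical interface)
  # clock: anything responding to now, injectable so the sketch is testable
  def initialize(queue, clock = Time)
    @queue = queue
    @clock = clock
  end

  def perform
    deliver
    # Schedule the next run only after this one completed without raising.
    @queue.enqueue(self.class.new(@queue, @clock), run_at: @clock.now + INTERVAL)
  end

  def deliver
    # the long running method goes here
  end
end

# Hypothetical stand-in for the real job table, so the sketch runs on its own.
class InMemoryQueue
  attr_reader :entries

  def initialize
    @entries = []
  end

  def enqueue(job, run_at:)
    @entries << { job: job, run_at: run_at }
  end
end

queue = InMemoryQueue.new
RecurringDeliver.new(queue).perform
puts "next run scheduled for #{queue.entries.last[:run_at]}"
```

Because real jobs are serialized into the database, an actual implementation would call the queue class directly instead of holding a reference to it as this sketch does.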

As a first step to diagnose the problem:
Stop your job workers
Launch a delayed job
Check whether it is present in the database.

Related

Rails - Old cron job keeps running, can't delete it

So I'm using Rails and I have a few Sidekiq workers, but none are enabled. I'm using the sidekiq-cron gem, which requires you to put files in app/workers/, configure a sidekiq scheduler in config/sidekiq_schedule.yml, and also add a few lines in config/initializers/sidekiq.rb. However, I've commented everything out from sidekiq_schedule.yml and also commented the following lines out from sidekiq.rb:
# Sidekiq scheduler.
# schedule_file = 'config/sidekiq_schedule.yml'
# if File.exists?(schedule_file) && Sidekiq.server?
# Sidekiq::Cron::Job.load_from_hash! YAML.load_file(schedule_file)
# end
However, if I launch Sidekiq, every minute (which is the old schedule), I see this in the prompt:
2018-01-19T02:54:04.156Z 22197 TID-ovsidcme8 ActiveJob::QueueAdapters::SidekiqAdapter::JobWrapper JID-8609429b89db2a91793509ea INFO: start
2018-01-19T02:54:04.164Z 22197 TID-ovsidcme8 ActiveJob::QueueAdapters::SidekiqAdapter::JobWrapper JID-8609429b89db2a91793509ea INFO: fail: 0.008 sec
and it fails because it's trying to launch a job that's not supposed to be launching.
I went to the Rails console (rails c) and tried to find the job, but nothing's in there:
irb(main):001:0> Sidekiq::Cron::Job.all
=> []
so I'm not quite sure why it's constantly trying to launch a job. If I go to the rails interface on my application, I don't see anything in the queue, nothing being processed, busy, retries, enqueued, nothing.
Any suggestions would be greatly appreciated. I've been trying to hunt this down for like the last hour and have no success. I even removed ALL of the workers from the workers directory, and yet it's still trying to launch one of them.
Because you have already loaded the jobs, I think their configuration is still in Redis. You can check this assumption by opening a new terminal tab with redis-cli and running:
KEYS '*cron*'
If those keys exist in Redis, clearing them will fix your issue.
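The same cleanup can be done from Ruby. This is a rough sketch: clear_matching_keys and FakeRedis are names invented here, and the key names in the sample data are only illustrative. The helper relies on scan_each(match:) and del(*keys), which the redis-rb client provides, so a real Redis.new connection should be usable in place of the stand-in.

```ruby
# Delete every key matching a glob pattern and return the deleted key names.
# `redis` is any client responding to scan_each(match:) and del(*keys).
def clear_matching_keys(redis, pattern)
  keys = redis.scan_each(match: pattern).to_a.uniq
  redis.del(*keys) unless keys.empty?
  keys
end

# Hypothetical in-memory stand-in so the sketch runs without a Redis server.
class FakeRedis
  def initialize(data)
    @data = data
  end

  # Returns an Enumerator over the matching key names.
  def scan_each(match:)
    @data.keys.select { |key| File.fnmatch(match, key) }.each
  end

  # Removes the given keys and returns how many existed.
  def del(*keys)
    keys.count { |key| @data.delete(key) }
  end

  def keys_left
    @data.keys
  end
end

redis = FakeRedis.new(
  "cron_jobs" => "set of job names",        # illustrative key names
  "cron_job:old_worker" => "job settings",
  "queue:default" => "unrelated data"
)
deleted = clear_matching_keys(redis, "*cron*")
puts "deleted: #{deleted.inspect}"
```

scan_each is used rather than KEYS because it walks the keyspace in batches instead of blocking Redis on one large call.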
Since you mentioned a cron job in your title but not in the question, I'm assuming there's a cronjob running the background sidekiq task.
Try running crontab -l in Terminal to see all your cron jobs. If you see something like "* * * * *", that means there's a job that is running every minute.
Then, use crontab -r to clear your crontab. Note that this deletes all of your scheduled tasks, not just one.

Clear worker cache in delayed jobs in production

I am using delayed jobs in my Rails application. It works fine, but an issue occurred on the production server. I created a class in lib and call its method from a controller to generate a CSV file through delayed jobs. It was working fine when I ran the delayed jobs locally and on the production server, but then I changed the file naming convention in this class and restarted the delayed jobs locally and then on production. Now when I call that method through a delayed job, it sometimes works according to my latest changes and sometimes uses the old file naming logic.
What could be the issue?
Delayed job has a hidden "feature" which is to ignore any changes to your app, and just use old settings, env-variables, email-templates, etc. You can clear every cache and restart your server, and it still holds onto data which no longer exists anywhere in your app's codebase.
delayed_job - Performs not up to date code?
Also be aware that DJ's "restart" does not always kill and restart all the workers, so you need to hunt them down and kill them all manually with
ps aux | grep delay
See: Rails + Delayed Job => email view template does not get updated
I have not yet found a "clear delayed job cache" function. If it exists, someone please post it here.
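For the "hunt them down" step, here is a small sketch that pulls worker PIDs out of ps aux output so you can see what is still running before killing anything. worker_pids is a made-up helper and the sample text is illustrative, not real output.

```ruby
# Extract the PIDs of delayed_job worker processes from `ps aux` output.
def worker_pids(ps_output, pattern = /delayed_job/)
  ps_output.each_line.map do |line|
    next unless line =~ pattern
    next if line.include?("grep") # skip the grep process itself
    line.split[1].to_i            # PID is the second column of `ps aux`
  end.compact
end

# Illustrative sample only.
sample = <<~PS
  USER     PID %CPU %MEM    VSZ   RSS TTY STAT START TIME COMMAND
  deploy  4211  0.0  1.2 250000 48000 ?   Sl   10:00 0:05 delayed_job.0
  deploy  4212  0.0  1.2 250000 48000 ?   Sl   10:00 0:04 delayed_job.1
  deploy  4300  0.0  0.0   9000   900 pts/0 S+ 10:05 0:00 grep delayed_job
PS

puts worker_pids(sample).inspect
# On a real server you would pass `ps aux` output instead of the sample:
#   worker_pids(`ps aux`).each { |pid| Process.kill("TERM", pid) }
```

Any worker that survives a restart and still shows up here is a candidate for the old code being served.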
In my case, I spent almost 4 hours trying everything to delete failing delayed_jobs on Heroku. If you got here trying to kill a zombie delayed_job but you're on Heroku, the above won't work.
You cannot run ps aux like you would on a regular server, nor can you run rake jobs:clear, and if you check via the Rails console you'll see the jobs there, but not in the database, so there is nothing you can do there either.
What I did was put the app in maintenance mode, make a deployment that completely uninstalled the delayed_job gem and all references to it, and then make another deployment reverting that change. That cleared the zombie cache and did the trick.
I had a similar issue in Dokku. My solution was to remove the worker=1 entry from my DOKKU_SCALE file (so all it contained was web=1) and also to remove the worker: bundle exec rake jobs:work line from my Procfile.
I pushed that to my production server, reversed the changes above, pushed again and it was fixed.

Heroku Scheduler not creating log

I recently set up the Scheduler add on and set up my rake task, 'rake cron_jobs:my_task'.
When I test it with
'heroku run rake cron_jobs:my_task', it works fine.
The scheduler also claims it ran when it was supposed to, and is scheduled to run again, but there's no logging associated with the process the way https://devcenter.heroku.com/articles/scheduler#inspecting-output says there should be.
'heroku ps' shows no scheduled dynos, 'heroku logs --ps scheduler.1' has no output.
What am I missing?
Actually I was trying to solve this myself, and did not find the answer anywhere, so here it is if someone else is struggling with this:
heroku logs --tail --ps scheduler
--tail is important to keep streaming the logs.
My best guess: the heroku ps and heroku logs commands only give you status logs for currently running processes/dynos.
So after the scheduled rake task is done, you can't reach the logs through the command line.
You can access the history of your logs by using one of the logging addons. Most of them offer a free tier too.
They all are based on the log drains which you also could use directly, if you want to build it yourself.
Here is what I do for that:
Simply include puts statements in the task itself so you know when the job started running and when it finished.
You can also include puts statements in the executed job.
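Something along these lines is what I mean by the puts statements. It is only a sketch: with_task_logging is a name I made up, and the block body stands in for whatever your rake task actually does.

```ruby
# Write each line immediately instead of buffering, so it reaches the log
# stream while the task is still running.
$stdout.sync = true

# Wrap a task body with start, finish and failure markers.
def with_task_logging(name, out: $stdout)
  started = Time.now
  out.puts "[#{name}] started at #{started.utc}"
  result = yield
  out.puts "[#{name}] finished in #{(Time.now - started).round(2)}s"
  result
rescue StandardError => e
  out.puts "[#{name}] failed: #{e.class}: #{e.message}"
  raise # re-raise so the task still exits with an error
end

with_task_logging("cron_jobs:my_task") do
  # the real work of the rake task goes here
end
```

The markers make it easy to search the logs for the task name and see whether a scheduled run started at all.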
I'm using the Papertrail add-on, a very powerful logging tool that lets you search for any particular log at a specific time. You can also add an alert for when your scheduled job starts to run.
I had a similar problem (using the newer Heroku PGBackups Service) and found an unexpected explanation for it.
The rake task rake pgbackups-archive was not run by Heroku Scheduler, but it worked when I ran it manually from the command line.
In my case, I noticed that my issues were caused by the different time zone used by the Heroku interface (which seems not to be CET). So my rake task which should have run at a specific time daily effectively never ran, as I changed the specific time throughout the day for testing and I always missed the specified time in the Heroku timezone.
You can try running the task every ten minutes and see if it works.
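As far as I know, Heroku Scheduler takes its times in UTC, so it helps to convert the local time you have in mind before entering it. A quick sketch of the conversion (scheduler_time_utc is a name invented here; the offset is whatever your local zone currently uses):

```ruby
# Convert a wall-clock time at a given UTC offset to the UTC clock time.
def scheduler_time_utc(hour, minute, utc_offset)
  # The date is arbitrary; only the time of day matters here.
  Time.new(2000, 1, 1, hour, minute, 0, utc_offset).utc.strftime("%H:%M")
end

# 09:30 in CET (UTC+1)
puts scheduler_time_utc(9, 30, "+01:00") # prints 08:30
```

Remember that the offset changes with daylight saving time, so a task pinned to a UTC time will shift by an hour in local terms twice a year.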

Rake Task Starts But Stops abruptly when executed via controller

I have a set of rake tasks that run on the production server, detached from the main thread, in the background.
Here is the code that executes one of them:
def vehicle
  @estate = Estate.find(@estate_id)
  @date_string = @login_month.strftime("%m%Y")
  system("rake udpms:process_only_vehicle[#{@date_string},#{@estate_id}] &")
  redirect_to :controller => "reports/error_messages", :message => "Processing will happen in the background and reports will be refreshed after two minutes", :target => "_blank"
end
When this code is executed via the URL route, it runs the rake task (I can see it if I check the active processes on the production machine), but it ends abruptly after about 10 seconds.
ps axl | grep rake
This is what it shows:
ruby /usr/local/rvm/gems/ruby-1.8.7-p352/bin/rake udpms:process_only_vehicle[082012,5]
If I execute the same rake task from the app folder in the terminal, it runs without any errors. It also runs without any issues on the dev machine (OS X). The server is Mint. The rake version is the same on both, and there is only one version of the gem.
Since it's the production server, there are no logs other than production.log, which is no help. Any help on how to go about debugging this issue will be much appreciated.
This is probably happening because your server software reaps requests that take longer than 10 seconds to respond. Despite the fact you're kicking off a rake task, it still has to wait for that system call to execute: if it takes awhile then the task will be terminated and the server worker returned to the worker pool.
In a more general sense, this is not the appropriate way to make a task happen in the background. You probably want to use Resque or Delayed Job, which enqueue tasks and run them in the background for you.
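If you keep shelling out while you move to a job queue, at least capture the task's output so there is something to debug with. This is a sketch, not a replacement for Resque or Delayed Job: spawn_detached is a made-up helper, the log path is arbitrary, and Process.spawn needs Ruby 1.9 or later (the question is on 1.8.7). The demo runs a harmless Ruby one-liner in place of the rake command.

```ruby
require "rbconfig"
require "tmpdir"

# Start a command detached from the caller, appending its stdout and stderr
# to a log file so that a failure leaves a trace.
def spawn_detached(argv, log_path)
  pid = Process.spawn(*argv,
                      in: File::NULL,
                      out: [log_path, "a"],
                      err: [:child, :out], # send stderr to the same file
                      pgroup: true)        # run in its own process group
  Process.detach(pid) # background thread reaps the child, avoiding a zombie
end

# Stand-in for ["rake", "udpms:process_only_vehicle[082012,5]"].
log_path = File.join(Dir.tmpdir, "detached_task.log")
watcher = spawn_detached([RbConfig.ruby, "-e", "puts 'task ran'"], log_path)
puts "started pid #{watcher.pid}, logging to #{log_path}"
```

Passing the command as an array also keeps the interpolated parameters away from the shell, which the string form with a trailing & does not.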

How do I clear stuck/stale Resque workers?

As you can see from the attached image, I've got a couple of workers that seem to be stuck. Those processes shouldn't take longer than a couple of seconds.
I'm not sure why they won't clear or how to manually remove them.
I'm on Heroku using Resque with Redis-to-Go and HireFire to automatically scale workers.
None of these solutions worked for me, I would still see this in redis-web:
0 out of 10 Workers Working
Finally, this worked for me to clear all the workers:
Resque.workers.each {|w| w.unregister_worker}
In your console:
queue_name = "process_numbers"
Resque.redis.del "queue:#{queue_name}"
Otherwise you can try to fake them as being done to remove them, with:
Resque::Worker.working.each {|w| w.done_working}
EDIT
A lot of people have been upvoting this answer and I feel that it's important that people try hagope's solution which unregisters workers off a queue, whereas the above code deletes queues. If you're happy to fake them, then cool.
You probably have the resque gem installed, so you can open the console and get current workers
Resque.workers
It returns a list of workers
#=> [#<Worker infusion.local:40194-0:JAVA_DYNAMIC_QUEUES,index_migrator,converter,extractor>]
Pick a worker and call prune_dead_workers on it, for example the first one:
Resque.workers.first.prune_dead_workers
Adding to answer by hagope, I wanted to be able to only unregister workers that had been running for a certain amount of time. The code below will only unregister workers running for over 300 seconds (5 minutes).
Resque.workers.each {|w| w.unregister_worker if w.processing['run_at'] && Time.now - w.processing['run_at'].to_time > 300}
I have an ongoing collection of Resque related Rake tasks that I have also added this to: https://gist.github.com/ewherrmann/8809350
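The one-liner above relies on String#to_time from ActiveSupport. Here is a sketch of the same age check using only the standard library, with a stand-in worker so it runs without Resque. long_running and FakeWorker are names invented for this example; it assumes worker.processing returns a hash with a "run_at" string, as in the one-liner.

```ruby
require "time"

# Return the workers whose current job started more than max_age seconds ago.
# Idle workers, whose processing hash has no "run_at", are skipped.
def long_running(workers, max_age: 300, now: Time.now)
  workers.select do |worker|
    run_at = worker.processing["run_at"]
    run_at && now - Time.parse(run_at) > max_age
  end
end

# Hypothetical stand-in for Resque::Worker.
FakeWorker = Struct.new(:id, :processing)

now = Time.utc(2020, 1, 1, 12, 0, 0)
workers = [
  FakeWorker.new("host:1:default", { "run_at" => "2020-01-01T11:50:00Z" }), # 10 minutes
  FakeWorker.new("host:2:default", { "run_at" => "2020-01-01T11:59:00Z" }), # 1 minute
  FakeWorker.new("host:3:default", {})                                      # idle
]

stuck = long_running(workers, now: now)
puts stuck.map(&:id).inspect
# With Resque loaded, something like:
#   long_running(Resque.workers).each(&:unregister_worker)
```

Listing the matches first and unregistering them in a second step makes it easier to confirm that only the stuck workers are selected.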
Run this command wherever you ran the command to start the server
$ ps -e -o pid,command | grep [r]esque
you should see something like this:
92102 resque: Processing ProcessNumbers since 1253142769
Make a note of the PID (process ID); in my example it is 92102.
Then you can stop the process in one of two ways.
Gracefully: kill -QUIT 92102
Forcefully: kill -TERM 92102
* QUIT and TERM are signal names, so they are sent with the kill command.
Let me know if you have any trouble.
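The same two signals can be sent from Ruby with Process.kill. A small sketch follows; stop_worker is a made-up helper, and the demo signals a throwaway sleep process rather than a real worker, so it assumes a Unix-like system with a sleep command.

```ruby
# Send the signal Resque documents for each kind of shutdown:
# QUIT lets the current job finish first, TERM stops right away.
# Returns the signal name, or nil if the process no longer exists.
def stop_worker(pid, graceful: true)
  signal = graceful ? "QUIT" : "TERM"
  Process.kill(signal, pid)
  signal
rescue Errno::ESRCH
  nil # no such process: it has already exited
end

# Demo against a throwaway child process standing in for a stuck worker.
pid = Process.spawn("sleep", "30")
stop_worker(pid, graceful: false)
Process.wait(pid)
puts "child ended by signal #{$?.termsig}"
```

For a real worker you would pass the PID found with ps, for example stop_worker(92102).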
I just did:
% rails c production
irb(main):001:0>Resque.workers
Got the list of workers.
irb(main):002:0>Resque.remove_worker(Resque.workers[n].id)
... where n is the zero based index of the unwanted worker.
I had a similar problem: Redis had saved a DB to disk that included invalid (non-running) workers, so they reappeared each time Redis/Resque was started.
Fix this using:
Resque::Worker.working.each {|w| w.done_working}
Resque.redis.save # Save the DB to disk without ANY workers
Make sure you restart Redis and your Resque workers.
Started working on https://github.com/shaiguitar/resque_stuck_queue/ recently. It's not a solution to how to fix stuck workers but it addresses the issue of resque hanging/being stuck, so I figured it could be helpful for people on this thread. From README:
"If resque doesn't run jobs within a certain timeframe, it will trigger a pre-defined handler of your choice. You can use this to send an email, pager duty, add more resque workers, restart resque, send you a txt...whatever suits you."
Been used in production and works pretty well for me thus far.
Here's how you can purge them from Redis by hostname. This happens to me when I decommission a server and workers do not exit gracefully.
Resque.workers.each { |w| w.unregister_worker if w.id.start_with?(hostname) }
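One caveat with start_with? is that "app1" also matches "app10". Since worker ids look like hostname:pid:queues, you can compare the hostname part exactly instead. A sketch, with ids_on_host and the sample ids invented for illustration:

```ruby
# Worker ids look like "hostname:pid:queues"; compare the hostname exactly.
def ids_on_host(worker_ids, hostname)
  worker_ids.select { |id| id.split(":", 2).first == hostname }
end

ids = ["app1:4211:default", "app10:977:default,mailers", "app1:4302:*"]
puts ids_on_host(ids, "app1").inspect
# With Resque loaded, something like:
#   Resque.workers.each do |w|
#     w.unregister_worker if ids_on_host([w.id], hostname).any?
#   end
```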
I ran into this issue and started down the path of implementing a lot of the suggestions here. However, I discovered the root cause that was creating this issue was that I was using the gem redis-rb 3.3.0. Downgrading to redis-rb 3.2.2 prevented these workers from getting stuck in the first place.
I've cleared them out from redis-cli directly. Luckily redistogo.com allows access from environments outside heroku.
Get dead worker ID from the list. Mine was
55ba6f3b-9287-4f81-987a-4e8ae7f51210:2
Run this command in redis directly.
del "resque:worker:55ba6f3b-9287-4f81-987a-4e8ae7f51210:2:*"
You can monitor redis db to see what it's doing behind the scenes.
redis xxx.redistogo.com> MONITOR
OK
1380274567.540613 "MONITOR"
1380274568.345198 "incrby" "resque:stat:processed" "1"
1380274568.346898 "incrby" "resque:stat:processed:c65c8e2b-555a-4a57-aaa6-477b27d6452d:2:*" "1"
1380274568.346920 "del" "resque:worker:c65c8e2b-555a-4a57-aaa6-477b27d6452d:2:*"
1380274568.348803 "smembers" "resque:queues"
Second last line deletes the worker.
In resque 2.0.0, here's one way that seems to work to remove only the apparently dead workers:
Resque::Worker.all_workers_with_expired_heartbeats.each { |w| w.unregister_worker }
I am not an expert in what's going on, so it's possible there's a better way to do this or that this will have problems. I'm just trying to figure this out too.
This seems to remove workers that haven't sent a "heartbeat" in much longer than expected from the resque worker list.
If the phantom worker was in a "running" state, then a new entry in the "failed" job queue will be created corresponding to phantom job.
I had stuck/stale resque workers here too, or should I say 'jobs', because the worker is actually still there and running fine; it's the forked process that is stuck.
I chose the brutal solution of using a bash script to kill any forked process that has been "Processing" for more than 5 minutes. The worker then just spawns the next job in the queue, and everything keeps going.
Have a look at my script here: https://gist.github.com/jobwat/5712437
If you are using newer versions of Resque, you'll need to use the following command as the internal APIs have changed...
Resque::WorkerRegistry.working.each {|work| Resque::WorkerRegistry.remove(work.id)}
This avoids the problem as long as you have a resque version newer than 1.26.0:
resque: env QUEUE=foo TERM_CHILD=1 bundle exec rake resque:work
Keep in mind that it does not let the currently running job finish.
If you use Docker, you can also use these commands, where <id> is the worker id:
docker stop <id>
docker start <id>
