I am using DelayedJob for a long-running task in my app. The job is defined in a class MyJob saved in app/jobs/my_job.rb.
Everything was working, but I added some code to the file, restarted the server, and the changes aren't being picked up. Previously the job saved one field; now it should save two. I also added a logger.debug line to help me debug, but nothing shows up in the logs and the models aren't being saved with the new field.
This is in 'production' (though still using the Thin web server); development works fine.
Since the folder is in the autoload path (if I'm not mistaken), I didn't think I needed to do anything special. But since it isn't working, something must be off. Help? Let me know if you need more info.
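For reference, the job looks roughly like this (a simplified sketch; the model and field names are illustrative, not my real ones):
# app/jobs/my_job.rb -- simplified; SomeModel and the fields stand in for the real ones
class MyJob < Struct.new(:record_id)
  def perform
    record = SomeModel.find(record_id)
    record.first_field  = "value that was already being saved"
    record.second_field = "newly added value"        # the new field that never gets saved
    record.save!
    Rails.logger.debug "MyJob saved both fields for record #{record_id}"  # the debug line that never shows up
  end
end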
Well, it turns out I had a worker that was off the radar.
I run my server in a Docker container (I should have mentioned that), so even though I ran a
RAILS_ENV=production script/delayed_job restart
in both the container and on the host (the container mounts the app from the host as a volume), a worker I had probably started at some other time kept right on running. I spotted it when I went back through the logs, so I ran a
kill {pid}
and that solved my problem. So Flavio was right: I just had to kill the worker myself, because the restart script wasn't picking it up.
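If anyone else hits this, the stray worker shows up if you list processes both on the host and inside the container:
ps aux | grep [d]elayed_job    # the [d] trick keeps grep itself out of the results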
I've inherited a new project with a production server already running on Passenger.
After doing a normal deploy as described in the one-page docs left by the old team (just cap prod deploy), we are having constant problems. First of all, the code we pushed doesn't seem to be running. It should be adding new data to items in the database via rake tasks. The code is physically there, in the current folder, but it doesn't seem to be getting triggered.
I noticed that neither rails nor the gems were available when I tried a simple rails c over SSH. After installing everything and launching the code manually with a binding.pry added, it looked like the code was triggered, but via the normally scheduled rake task it was not.
It looks like Passenger is running as a daemon, since there is no pid file in the tmp folder (the usual place for a Rails app).
1) Is there a chance that restarting the server will actually help? It was not restarted after the deploy, and I have no idea how to restart it without a pid file.
2) passenger-config restart-app actually returns two servers. Can they collide and prevent the app from working normally? (Update: the servers are not the same; their names differ by a single letter.)
Even so, passenger-config restart-app doesn't seem to restart the server.
Sorry for the wall of text, but I've spent two nights trying to get this to work and I'm still lost.
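For reference, this is roughly what I have been running (the app path here is a placeholder, not our real one); I gather Passenger also supports a file-based restart via tmp/restart.txt, but I have not confirmed whether it applies to this setup:
passenger-config restart-app /var/www/myapp/current     # placeholder path
touch /var/www/myapp/current/tmp/restart.txt            # Passenger picks this up on the next request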
I am using delayed jobs in my Rails application. It works fine, but an issue has come up on the production server. I created a class in lib and call its method from a controller to generate a CSV file through delayed jobs. This was working fine when I ran the delayed jobs locally and on the production server, but then I made some changes to the class's file-naming convention and restarted the delayed jobs locally and then on production. Now when I call that method through a delayed job, sometimes it works according to the latest changes I made to the class and sometimes it uses the old file-naming logic.
What could be the issue?
Delayed Job has a hidden "feature": its long-running worker processes keep whatever code was loaded when they started, so they ignore changes to your app and keep using old settings, environment variables, email templates, etc. You can clear every cache and restart your web server, and a stale worker will still hold onto code and data that no longer exist anywhere in your app's codebase.
See: delayed_job - Performs not up to date code?
Also be aware that DJ's "restart" does not always kill and restart all the workers, so you need to hunt them down and kill them all manually with
ps aux | grep delay
See: Rails + Delayed Job => email view template does not get updated
I have not yet found a "clear delayed job cache" function. If it exists, someone please post it here.
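For what it's worth, there is no cache to clear as such: the stale behaviour lives in the long-running worker processes themselves. If what you actually need is to wipe the queued jobs, that can be done from the Rails console (a sketch, assuming the ActiveRecord backend):
# Rails console; assumes delayed_job's ActiveRecord backend.
# This only clears job rows -- it does NOT make already-running workers load new code.
Delayed::Job.where("failed_at IS NOT NULL").delete_all   # remove failed jobs only
Delayed::Job.delete_all                                  # or wipe the whole queue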
In my case, I just spent almost four hours trying everything to delete failing delayed_jobs on Heroku. In case you got here trying to kill a zombie delayed_job but you're on Heroku, the above won't work.
You can't run ps aux like you would on a regular server, nor can you run rake jobs:clear, and if you check via the Rails console you'll see the jobs there but not in the database, so there's nothing you can do there either.
What I did was put the app in maintenance mode, make a deployment that completely removed the delayed_job gem and all its references, and then make another deployment reverting that change. That cleared the zombie cache, and it did the trick.
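I can't say whether it would have cleared my particular zombies, but a cheaper thing to try first on Heroku is scaling the worker dynos to zero and back, which replaces the processes entirely (this assumes your Procfile process type is literally named worker):
heroku maintenance:on
heroku ps:scale worker=0    # terminates the worker dynos and whatever stale code they hold
heroku ps:scale worker=1
heroku maintenance:off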
I had a similar issue in Dokku. My solution was to remove the worker=1 entry from my DOKKU_SCALE file (so all it contained was web=1) and also to remove the worker: bundle exec rake jobs:work line from my Procfile.
I pushed that to my production server, then reverted the changes above, pushed again, and it was fixed.
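Concretely, these are the entries in question; the worker lines came out for one deploy and went back in on the next:
Procfile (worker line removed for one deploy, then restored):
worker: bundle exec rake jobs:work
DOKKU_SCALE (reduced to just web=1 for that deploy, then restored):
web=1
worker=1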
I'm developing an application that I have deployed to OpenShift.
I have "moved" the actual user registration process into a delayed job because a lot of stuff takes place during it. Every two days (or so), the delayed_job process stops running.
In the logs I find this:
Error while reserving job: closed MySQL connection
I tried starting it with the following command:
RAILS_ENV=production bin/delayed_job -m start
but the problem still exists.
Any ideas?
Try adding this to your database.yml:
reconnect: true
I am not sure if this will fix your problem, but it's worth trying.
Also, have a look at this MySQL documentation about lost connections.
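For the record, the option goes inside the environment's block in config/database.yml; a sketch for production (the adapter and credentials here are illustrative):
production:
  adapter: mysql2
  database: myapp_production
  username: myapp
  password: secret
  reconnect: true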
I just had this problem (not using OpenShift). After I tried the command you mentioned, I still had the problem. Then I restarted delayed_job like so:
RAILS_ENV=production bin/delayed_job stop
RAILS_ENV=production bin/delayed_job start
and the problem went away. In my case the issue was that delayed_job was looking for a method that no longer existed and simply needed to be restarted. Maybe this helps.
I also tried Vimsha's answer in development (but not in production) and it didn't affect the result for me.
I have two jobs that are queued simultaneously, and one worker runs them in succession. Both jobs copy some files from the builds/ directory in the root of my Rails project and place them into a temporary folder.
The first job always succeeds and never has a problem; it doesn't matter which job runs first, either. Whichever runs first will work.
The second one receives this error when trying to copy the files:
No such file or directory - /Users/apps/Sites/my-site/releases/20130829065128/builds/foo
That releases folder is two weeks old and should not still be on the server. It is essentially empty, housing only a public/uploads directory and nothing else. I have killed all of my workers and restarted them multiple times, and have redeployed the Rails app multiple times. When I delete that releases directory, it gets created again.
I don't know what to do at this point. Why would this worker always create and look in this old releases directory? Why would only the second job do this? I am getting the path with Rails.root.join('builds'), so Rails.root apparently points at a two-week-old Capistrano release? I should also mention this only happens in the production environment. What can I do?
Resque is not being restarted (stopped and started) on deployments, which causes old versions of the code to keep running. Each stale worker continues to service the queue, resulting in strange errors or behaviors.
Based on the path name it looks like you are using Capistrano for deploying.
Are you using the capistrano-resque gem? If not, you should give that a look.
I had exactly the same problem and here is how I solved it:
In my case the problem was how Capistrano handles the pid files, which record which workers currently exist. These files are normally stored in tmp/pids/. You need to tell Capistrano NOT to store them in each release folder, but in shared/tmp/pids/. Otherwise, after a new deployment, Resque does not know which workers are currently running: it looks into the new release's pids folder, finds no files, and therefore assumes there are no workers that need to be shut down. It just starts new workers, while all the old workers keep running, and you cannot see them in the Resque dashboard. You can only see them if you check the processes on the server.
Here is what you need to do:
Add the following lines to your deploy.rb (by the way, I am using Capistrano 3.5):
append :linked_dirs, ".bundle", "tmp/pids"
set :resque_pid_path, -> { File.join(shared_path, 'tmp', 'pids') }
On the server, run htop in the terminal and press T to see all the processes that are currently running. It is easy to spot the Resque worker processes, and you can see the release folder's name attached to each of them.
You need to kill all of those worker processes by hand. Exit htop and run the following command to kill all Resque processes (I like to have it completely clean):
sudo kill -9 `ps aux | grep [r]esque | grep -v grep | cut -c 10-16`
Now you can do a new deploy. You will also need to start resque-scheduler again.
I hope that helps.
I have a Rails app installed on a Slicehost server running Apache 2 and Ubuntu LTS 10.04. Things had worked beautifully up until now: I edit a file, do a quick mongrel_rails cluster::restart, and the changes are reflected in production. However, this process has suddenly broken down.
For example, I have a class called Master located in /lib/master.rb. I added a new method to this class that simply runs puts "it works!", then restarted the mongrel cluster. Looking at the production logs, the server throws an error and thinks this method doesn't exist. When I go to the console using ruby script/console production, however, I can use the new method perfectly. I even tried deleting the file containing the entire Master class. Once again, production thought it was still there, but the production console correctly recognized that it was missing.
Any ideas? How can the production environment detect a class that doesn't even exist anymore?
Funny, I spent two hours debugging this, then posted to Stack Overflow and figured it out in 20 minutes.
The problem was that I needed to restart my background jobs as well. They were still running the old versions of the classes stored in /lib. It's interesting that this problem has never snagged me before.
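Concretely, a deploy/restart now looks something like this for me (assuming the workers were started with the standard delayed_job script; adjust for however yours are run):
mongrel_rails cluster::restart
RAILS_ENV=production script/delayed_job restart    # restart the workers so they load the new /lib code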