delayed_job dies without error -- leaving job in locked state - ruby-on-rails

After DJ dies the log files indicate nothing.
running: ./script/delayed_job status
gives: pid-file for killed process 1143 found (/appPath/tmp/pids/delayed_job.pid), deleting.
delayed_job: no instances running
The strange thing is that if I use ./script/delayed_job run, it runs perfectly in the foreground and never dies.
Tried many versions of delayed_job and mongoid with the same results.
Anyone know how to debug this?
Using:
rails (3.2.7)
delayed_job_mongoid (2.0.0)
mongoid (3.0.3)
delayed_job (3.0.3)

Turns out delayed_job was executing a job that caused a segmentation fault, which would kill the delayed_job daemon.
After debugging, it turned out that Random.rand() causes a reproducible segmentation fault when run in a daemonized environment. This has to do with the initial seeding and setup of the random number generator, which apparently is not handled properly after daemonizing.
The solution: Random.new.rand()
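A minimal sketch of that change, using a hypothetical job class (not from the original post):
class CouponCodeJob < Struct.new(:user_id)  # hypothetical job, for illustration only
  def perform
    # Random.rand relies on the shared default generator, which segfaulted
    # under the daemonized worker here; an explicit generator instance avoids it.
    code = Random.new.rand(100_000..999_999)
    # ... use or persist code ...
  end
end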

I'm wondering if the weird behaviour in this Stack Overflow DJ log question could account for the behaviour you're seeing. The answer there looks plausible too. Stranger things have happened.
Pt 2:
Permission issues could very well be mucking it up too. Is this in production or dev? Does it work in dev?
Pt 3: From the GitHub page of delayed_job_mongoid: make sure you are using MongoDB version 1.3 or newer. Are you?
Pt 4: And have you run this? script/rails runner 'Delayed::Backend::Mongoid::Job.create_indexes'
Lastly, as of today DJM is running red on Travis, with some errors that may affect you. I once had a shoddy build in a gem drive me to drink, only for it to be fixed 2 days later. http://travis-ci.org/#!/collectiveidea/delayed_job_mongoid/jobs/1962498
If that isn't it, add pry to the Gemfile and drop binding.pry into the startup script, starting at the top and working down.
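For instance, a rough sketch of that approach; the script body below mirrors what delayed_job 3.x generates, so treat the exact contents as an assumption:
# Gemfile
gem 'pry'

# script/delayed_job
#!/usr/bin/env ruby
require File.expand_path(File.join(File.dirname(__FILE__), '..', 'config', 'environment'))
require 'delayed/command'
require 'pry'

binding.pry  # inspect state here, then move the breakpoint further down the script
Delayed::Command.new(ARGV).daemonize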

Related

how to debug to find the code that is blocking the main thread

I have a Rails app running version 6.1.1, and currently we use Ruby 2.7.2.
While trying to upgrade to Ruby 3, I'm facing an interesting issue: some code apparently is blocking the main thread. When I try to boot the server or run the tests, the console gets stuck and I can't even stop the process; I have to kill it.
I tracked it down to one gem called valvat, used to validate EU VAT numbers. I opened an issue on its GitHub repo, but the maintainer couldn't reproduce it even using the same Gemfile.lock I have, which led me to believe that it might not be just the gem; it has to be something else in my code.
This is what happens when I try to boot the server:
=> Booting Puma
=> Rails 6.1.1 application starting in development
=> Run `bin/rails server --help` for more startup options
^C^C^C
As one can see, I can't even stop it; the thread is hanging and I can't tell exactly what is blocking it.
I tried running the specs with -b -w to see what I could find, but got the same result: the thread hangs and the warnings I get from Ruby are just generic ones like "method already defined" or similar.
This is the last output from the console while running specs with -b -w before the thread hangs:
/Users/luiz/rails_dev/app/config/initializers/logging_rails.rb:18: warning: method redefined; discarding old tag_logger
/Users/luiz/.rbenv/versions/3.0.0/lib/ruby/gems/3.0.0/gems/activejob-6.1.1/lib/active_job/logging.rb:19: warning: previous definition of tag_logger was here
Thing is, I also get these warnings when I remove the gem and run this command, though the specs then run without issues.
Is there a way to track this down to whatever is causing the thread to hang?
If you have no error message, it's hard to understand where exactly your application hangs.
As the developer cannot reproduce it with your Gemfile.lock, it's possible that one of your config files or initializers is the culprit. You could create a new blank application with the same Gemfile, add your config and initializer files one by one, and test each time whether the server boots, until you find the one causing the freeze.
Also test your application with another Ruby version (are you using rbenv or RVM?).
Also check your syslog; since valvat calls a web service, you may find connection errors there. This doc explains how to access your syslog: https://www.howtogeek.com/356942/how-to-view-the-system-log-on-a-mac/
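One more way to see where the boot is stuck (a sketch, not something the answer above relies on; it won't help if a C extension is holding the GVL, and it has to load before the code that hangs) is a throwaway early initializer that periodically dumps every thread's backtrace:
# config/initializers/00_thread_dump.rb -- temporary debugging aid, file name is arbitrary
Thread.new do
  loop do
    File.open('tmp/thread_dump.log', 'w') do |f|  # path relative to where the server is started
      Thread.list.each do |t|
        f.puts t.inspect
        (t.backtrace || []).first(15).each { |line| f.puts "  #{line}" }
        f.puts
      end
    end
    sleep 5
  end
end
Boot the app, wait for the hang, then read tmp/thread_dump.log from another terminal.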

Why does Rails console create so many Ruby processes?

I experimented with a Rake task run from Cron. I started with no Ruby processes, then the Cron job started and spawned one process; that process, highlighted in my process list, is the one run by Cron, which is expected.
I wanted to check whether any records were being written to the database, so I ran rails c to enter the Rails console, and noticed that suddenly four other Ruby processes showed up in my process list. Why would this happen? I would think that running the console should create one other process, not four.
After quitting the console, I am left with three Ruby processes including the Rake task that is running.
I am using Rails 4.2.
It's not that I find this problematic, but I am curious why there would need to be more than one process for a REPL, and then two leftover processes after the REPL is closed.
This is because of spring, which has shipped with Rails by default for a little while now.
You might notice that the second time you run rails c it is a lot faster than the first time. This is because the first time you run a springified script, your app is loaded as normal and then forked to run what you requested. The second time around, the already-loaded app can simply fork again, so you get a much faster startup. This is possible because of the extra processes you have noticed.
You can see them by running
spring status
And you can kill them by running
spring stop
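If the extra processes get in the way, spring can also be bypassed per command or shut down entirely (DISABLE_SPRING is spring's documented opt-out environment variable):
DISABLE_SPRING=1 rails c
spring stop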

Rspec hangs after it finishes running through my tests

Has anyone seen this problem before? Sometimes when running specs for my Rails 3.2.14 project rspec seems to finish as usual:
Finished in 1.27 seconds
6 examples, 2 failures
Failed examples:
rspec ./spec/models/my_spec.rb:123 # Hello world 1
rspec ./spec/models/my_spec.rb:234 # Hello world 2
but then it just hangs there and won't let me continue working in that shell. I can kill -9 the process from another terminal tab, or just start a new shell and run the tests again there, but it makes test driven development a huge pain.
When I restart my computer, the problem goes away for a while, but it always happens again eventually. After it hangs once, it keeps hanging every time I run rspec, even if I run different tests in a different project. The same tests in the same projects pass just fine on my coworkers' computers every time.
I'm not sure what information would help to answer this question so let me know if there is something I should add to this post. I'm running ruby 2.0.0p195 and rails 3.2.14. I've got Mac OS 10.7.5. I use zsh and rbenv.
Thanks for reading!
I had the same issue and solved it by adding the following to my spec_helper.rb:
SimpleCov.start do
  use_merging false
end
Note that I'm running on a VirtualBox VM with a synced directory.
Here's the explanation I found as to why this works:
"When simplecov exists, by default it will try to merge the recorded coverage with what's on disk. To avoid corruption it uses a lock file to guard this merge. Because my virtual box shared fs is not actually posix compliant, the lock file would never acquire, and silently block forever here.
Since I don't care about merging coverage results, the solution for me was to simply disable this behavior with the use_merging flag."
https://gist.github.com/k-yamada/3930916
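A related sketch, assuming you only want coverage on demand (this goes beyond the original answer): load SimpleCov only when an opt-in environment variable is set, so everyday spec runs never touch the merge/lock path.
# spec_helper.rb -- near the very top, before application code is required
if ENV['COVERAGE']
  require 'simplecov'
  SimpleCov.start do
    use_merging false
  end
end
Then run COVERAGE=1 rspec when you actually want a coverage report.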
Ok, figured it out with a friend's help. All it took was removing SimpleCov from my spec_helper. Not sure why SimpleCov was causing the issue, but I'll post an update here if I find out.

what causes 'deadlock; recursive locking' error in a Rails app?

My rails app tracks any delayed_job errors, and we saw this one today for the first time:
deadlock; recursive locking /app/vendor/bundle/ruby/1.9.1/gems/delayed_job-3.0.5/lib/delayed/worker.r
The app has been performing flawlessly, with millions of delayed jobs handled w/o error.
Is this just "one of those random things" or is there something different we can/should do to prevent it from happening again?
I'm especially confused because we run only a single worker.
Our setup: Rails 3.2.12, Heroku app, Postgres, several web dynos but only 1 worker dyno.
This is an issue with Rack. See similar bug reports:
https://github.com/rack/rack/issues/658
https://github.com/rack/rack/issues/349
I had the same issue. The fix was to upgrade RubyGems. The command I used to upgrade:
gem update --system
Ref: https://github.com/pry/pry/issues/2137#issuecomment-720775183

Thinking Sphinx delta indexing fails in production

Here's what I've determined:
Delta indexing works fine in development
Delta indexing does not work when I push to the production server, and no action is logged in searchd.log
I'm running Phusion Passenger, and, as recommended in the basic troubleshooting guide, have confirmed that:
www-data has permission to run indexing rake tasks (ran them from command line manually)
the path to indexer and searchd are correct (/usr/local/bin)
there are no errors in production.log
What on earth could I possibly be missing? I'm running Ruby Enterprise 1.8.6, Rails 2.3.4, Sphinx 0.9.8.1, and Thinking Sphinx 1.2.11.
Thanks!
Last night as I slept it hit me. Unsurprisingly, it was a stupid issue involving bad configuration, though I am rather surprised that it produced the results it did. I guess I don't know much about Thinking Sphinx internals.
Recently I migrated servers. sphinx.yml looked like this:
production:
  bin_path: '/usr/local/bin'
  host: mysql.mysite.com
On the new server, MySQL was just a local service, but I had forgotten to remove that line. Interestingly, manual rake reindexing still worked just fine. I'm intrigued that Thinking Sphinx didn't throw an error when trying to reload the deltas, since mysql.mysite.com no longer exists, even though that was clearly the source of the issue.
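For reference, the fix was simply to drop the stale host line, since MySQL is now local:
production:
  bin_path: '/usr/local/bin'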
Thanks for all your help, and sorry to have brought up such a silly problem.
Are there any clues in Apache/Nginx's error log?
Here's the next troubleshooting step I would take. Open up the file for the delta indexing strategy you are using (presumably lib/thinking_sphinx/deltas/default_delta.rb). Find the line where it actually generates the indexing command. In mine (v1.1.6) it's line 20:
output = `#{config.bin_path}indexer --config #{config.config_file} #{rotate} #{delta_index_name model}`
Change that so you can log the command itself, and maybe log the output as well:
command = "#{config.bin_path}indexer --config #{config.config_file} #{rotate} #{delta_index_name model}"
RAILS_DEFAULT_LOGGER.info(command)
output = `#{command}`
RAILS_DEFAULT_LOGGER.info(output)
Deploy that to production and tail the log while modifying a delta-indexed model. Hopefully that will actually show you the problem. Of course maybe the problem is elsewhere in the code and you won't even get to this point, but this is where I would start.
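For example, from the app directory on the production server (assuming the default log location):
tail -f log/production.log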
I was having this problem and found the "bin_path" solution mentioned above. When it didn't seem to work, it took me a while to realize that I'd pasted in the example code for "production" when I was testing in the "staging" environment. Problem solved!
This was after making sure that the rake tasks that configure, index, and start Sphinx all run as the same user as your Passenger instance. If you log into the server as root to run these tasks, they will work from the console but not via Passenger.
I had the same problem: it worked on the command line, but not in the app.
Turns out that we still had a slave database that we were using for the indexing, but the slave wasn't getting updated.
As above, we faced the same issue on two machines. On the first, we had a MySQL problem that showed up in the apache2 log; it only seemed to affect the local OS X machine.
The second time, when we deployed to an Ubuntu server, we had the same issue. rails c production was fine, no errors, and so on.
It ended up being a permissions problem. It was hard to figure out because there were no problems starting Sphinx, although I guess I was doing so as root.
Using Capistrano and Passenger, we did this (a rough sketch follows the list):
Created a passenger user and added it to the www-data group
Changed the user in deploy.rb to be passenger
Manually changed all the files under /current to be owned by the above
Logged in as the passenger user
Ran rake ts:rebuild RAILS_ENV="production"
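A rough sketch of steps 2 and 3; the deploy path and group here are assumptions, not from the original answer:
# config/deploy.rb
set :user, 'passenger'

# on the server, as root, fix ownership of the current release
chown -R passenger:www-data /var/www/myapp/current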
Worked a treat for us...
Good luck
