I have a Resque job which pulls a CSV list of data off a remote server and then runs through the 40k+ entries to add any new items to an existing database table. The job runs fine, but it severely slows down the response time of any subsequent requests to the server. In the console where I've launched 'bundle exec rails server', I see no print statements while the job is running. However, once I hit my rails server (via a page refresh), I see multiple SELECT/INSERT statements roll by before the server responds. The SELECT/INSERT statements are clearly generated by my Resque job, but oddly they wait to print to the console until I hit the server through the browser.
It sure feels like I'm doing something wrong or not following the 'rails way'. Advice?
Here is the code in my Resque job which does the SELECTs/INSERTs:
# data is an array of hashes parsed from the CSV input. Max size is 1000.
ActiveRecord::Base.transaction do
  data.each do |h|
    MyModel.find_or_create_by_X_and_Y( h[:x], h[:y], h )
  end
end
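(Not from the original post: if the per-row SELECTs are the bottleneck, one option is to prefetch which rows already exist in a single query and insert only the missing ones. A sketch, assuming :x and :y together uniquely identify a record:)
require 'set'

ActiveRecord::Base.transaction do
  # One SELECT for the whole batch instead of one per row.
  existing = MyModel.where(:x => data.map { |h| h[:x] }).
                     select([:x, :y]).
                     map { |m| [m.x, m.y] }.
                     to_set

  data.each do |h|
    next if existing.include?([h[:x], h[:y]])
    MyModel.create!(h)
  end
end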
Software Stack
Rails 3.2.0
PostgreSQL 9.1
Resque 1.20.0
EDIT
I've finally taken the time to debug this a bit more. Even a very simple worker, like the one below, slows down the next server response. In the console where I've launched the rails server process, I can see that the delay occurs because stdout from the worker is printed only after I ping the server.
def perform
  s = Time.now
  0.upto( 90000 ) do |i|
    Rails.logger.debug i * i
  end
  e = Time.now
  Rails.logger.info "Start: #{s} ---- End #{e}"
  Rails.logger.info "Total Time: #{e - s}"
end
I can get the rails server back to its normal responsiveness if I suppress stdout when I launch it, but it doesn't seem like that should be necessary: bundle exec rails server > /dev/null
Any input on a better way to solve this issue?
I think this answer to "Logging issues with Resque" will help.
The Rails server, in development mode, has the log file open. My understanding -- I need to confirm this -- is that it flushes the log before writing anything new to it, in order to preserve the order. If you have the Rails server attached to a terminal, it wants to output all of the changes first! This can lead to large delays if your workers have written large quantities to the log.
Note: this has been happening to me for some time, but I just put my finger on it recently.
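If log flushing is indeed the culprit, one workaround (a sketch, not from the linked answer; the log path and hook usage are assumptions) is to give each forked worker its own log file, so heavy job logging never competes with the dev server's terminal:
# config/initializers/resque.rb
# Sketch: re-point Rails logging inside each forked Resque worker so job
# output goes to log/resque.log instead of the server's console.
Resque.after_fork = proc do |job|
  Rails.logger = Logger.new(Rails.root.join('log', 'resque.log').to_s)
end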
Related
I have an API which uses a service, in which I have used a Ruby thread to reduce the response time of the API. I have tried to share the context using the following example. It was working fine with Rails 4 and Ruby 2.2.1.
Now we have upgraded Rails to 5.2.3 and Ruby to 2.6.5, after which the service has stopped working. I can call the service from the console and it works fine, but with an API call the service becomes unresponsive once it reaches CurrencyConverter.new. Any idea what the issue could be?
class ParallelTest
  def initialize
    puts "Initialized"
  end

  def perform
    # Our sample set of currencies
    currencies = ['ARS','AUD','CAD','CNY','DEM','EUR','GBP','HKD','ILS','INR','USD','XAG','XAU']
    # Create an array to keep track of threads
    threads = []

    currencies.each do |currency|
      # Keep track of the child threads as you spawn them
      threads << Thread.new do
        puts currency
        CurrencyConverter.new(currency).print
      end
    end

    # Join on the child threads to allow them to finish
    threads.each do |thread|
      thread.join
    end
    { success: true }
  end
end
class CurrencyConverter
  def initialize(params)
    @curr = params
  end

  def print
    puts @curr
  end
end
If I remove the CurrencyConverter.new(currency), then everything works fine. CurrencyConverter is a service object that I have.
Found the Issue
Thanks to @anothermh for these links:
https://guides.rubyonrails.org/threading_and_code_execution.html#wrapping-application-code
https://guides.rubyonrails.org/threading_and_code_execution.html#load-interlock
As per the guide, when one thread is performing an autoload by evaluating the class definition from the appropriate file, it is important that no other thread encounters a reference to the partially-defined constant.
Only one thread may load or unload at a time, and to do either, it must wait until no other threads are running application code. If a thread is waiting to perform a load, it doesn't prevent other threads from loading (in fact, they'll cooperate, and each perform their queued load in turn, before all resuming running together).
This can be resolved by permitting concurrent loads.
https://guides.rubyonrails.org/threading_and_code_execution.html#permit-concurrent-loads
threads = []

Rails.application.executor.wrap do
  currencies.each do |currency|
    threads << Thread.new do
      CurrencyConverter.new(currency)
      puts currency
    end
  end

  # Join outside the loop, and let the spawned threads autoload
  # constants while this thread blocks on them.
  ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
    threads.map(&:join)
  end
end
Thank you everybody for your time, I appreciate it.
Don't re-invent the wheel and use Sidekiq instead. 😉
From the project's page:
Simple, efficient background processing for Ruby.
Sidekiq uses threads to handle many jobs at the same time in the same process. It does not require Rails but will integrate tightly with Rails to make background processing dead simple.
With 400+ contributors and 10k+ stars on GitHub, they have built a solid parallel job execution process that is production ready and easy to set up.
Have a look at their Getting Started guide to see for yourself.
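As a taste of the API, here is a minimal sketch of a Sidekiq worker (the worker name and URL are illustrative, not from the original post):
# app/workers/import_worker.rb -- illustrative names, not from the post.
class ImportWorker
  include Sidekiq::Worker

  def perform(url)
    # Fetch and process the CSV here; this runs on a Sidekiq thread,
    # not in the web process.
  end
end

# Enqueue from a controller or the console; returns immediately.
ImportWorker.perform_async('https://example.com/data.csv')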
I am using Unicorn as the app server for my Rails app, and am trying to figure out why there is sometimes a non-trivial (> 5 seconds) delay between the start of a request and when it reaches my controller.
This is what my production.log prints out:
Started GET "/search/articles.json?q=mashable.com" for 138.7.7.33 at 2015-07-23 14:59:19 -0400
Parameters: {"q"=>"mashable.com"}
Searching articles for keyword: mashable.com, format: json, Time: 2015-07-23 14:59:26 -0400
Notice how there is a 7 second delay between Started GET and "Searching articles for keyword", which is the first thing the controller method does.
articles.json is routed to my controller method "articles" which simply does this for now:
def articles
  format = params[:format]
  keyword = params["q"]
  Rails.logger.info "Searching articles for keyword: #{keyword}, format: #{format}, Time: #{Time.now.to_s}"
end
This is my routes.rb
MyApp::Application.routes.draw do
  match '/search/articles' => 'search#articles'
  # more routes here, but articles is the first route
end
What could possibly cause this delay? Is it because a Unicorn worker is busy? Is it because a Unicorn worker is taking up too much memory, which causes the system to be slow?
Note: I don't believe the delay is in making any database connections, but I could be wrong. The code doesn't need to make a database call, the max connections for my database is 1000, and there are usually at most 1-2 connections open.
Three thoughts:
1. You'll probably be better served using Puma instead of Unicorn.
2. Your system could be running out of memory, or it could have plenty available: install New Relic to troubleshoot where the bottleneck is.
3. You may also have more Unicorn workers than the number of connections your DB allows, in which case a worker has to wait for another to disconnect before it can connect. This would likely manifest as irregular 5-second delays rather than happening every time.
Actually, it might be caused by a before_filter callback; you should check that.
I think it could be caused by a lack of memory and thus frequent garbage collection, which freezes the whole system.
If it's a production problem, it could be caused by slow clients sending requests. New Relic and Monit are good options. You could consider sending signals to Unicorn workers to restart them to better understand the problem.
You could also try adding preload_app true to your Unicorn config to speed up the startup time of worker processes.
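For reference, a minimal sketch of such a config (the worker count and timeout values are illustrative, not from the question):
# config/unicorn.rb -- illustrative values, adjust for your app.
worker_processes 4   # keep this well under your DB's connection limit
timeout 30
preload_app true     # load the app once in the master, then fork

after_fork do |server, worker|
  # With preload_app true, each forked worker needs its own DB connection.
  ActiveRecord::Base.establish_connection
end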
It is instead taking up my processor, and then effectively timing out.
I have this in my controller:
after_save :handle_file

def handle_file
  Resque.enqueue UnpackFileOnS3, parent.id
end
It hits this mark, and then the entire app waits for it to set up and upload the files as prescribed inside my job. Then it predictably times out, because it takes a while to upload them.
This occurs in my console as well. If I run:
Resque.enqueue UnpackFileOnS3, 4
Then instead of enqueueing it, it locks up my console as it tries to run the entire job. I think that normally the console would just enqueue it to a worker via Redis.
Why isn't this actually happening in the background? I assume that if it were, the timeouts would not occur.
My guess is that you are running Resque in inline mode. In this mode queueing is disabled. Check your configs for this kind of code:
Resque.inline = ENV['RAILS_ENV'] == "cucumber"
# or whatever; the important part is the inline option
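You can confirm from the console (hedged: this is the standard Resque 1.x API):
# true means jobs run synchronously in-process instead of being queued.
Resque.inline?
# Restore real queueing through Redis.
Resque.inline = false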
I have a Rails app that sends multiple requests, both sequentially and in parallel, to a third-party API and does calculations in the backend.
I would like to know how long each of my API requests and calculations takes. Is there a performance testing gem I should use?
Note: my app uses Sidekiq to process backend jobs.
http://guides.rubyonrails.org/performance_testing.html might get you started; check out section 3 for details on wrapping methods in "benchmark", which outputs some useful stats to the log.
As a quick example:
def process
  Benchmark.bm do |x|
    x.report("Processing Task") do
      process_task(task_options)
    end
  end
end
would output something like:
                      user     system      total        real
Processing Task   8.206000   1.092000   9.298000 ( 14.609000)
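For timing a single call rather than a labelled report, Benchmark.realtime works well. A sketch (fetch_from_third_party_api is a made-up placeholder for whatever call you want to measure):
require 'benchmark'

response = nil
elapsed = Benchmark.realtime do
  response = fetch_from_third_party_api(params)  # placeholder call
end
Rails.logger.info format('Third-party API call took %.1fms', elapsed * 1000)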
Here's the code for the scraper:
class Scrape
  def perform
    url = "# a long url"
    agent = Mechanize.new
    agent.get(url)

    while agent.page.link_with(:text => "Next Page \u00BB") do
      agent.page.search(".content").each do |item|
        puts "."
        House.create!({
          # attributes...
        })
      end
      agent.page.link_with(:text => "Next Page \u00BB").click
    end
  end
end
On my local environment I can run it in the rails console just by typing
Scrape.new.delay.perform # to queue the job
rake jobs:work
and it works perfectly.
However, running the analogous command (with a worker running instead of rake jobs:work) in the Heroku console doesn't seem to do anything. I tried logging some lines to the Heroku log and I can get the url variable to log (so the method is at least getting called), but the "." which is there to show each pass of the while loop never appears, and no Houses are created in the database.
Anyone have any ideas what might be wrong?
Solved this problem myself; pretty obscure bug, though. I was using Ruby 1.9.2 in my local environment but I had the app deployed on a Ruby 1.8.7 stack.
The important difference is the change in character encoding handling between the two Ruby versions, which meant that Mechanize couldn't find a link containing the Unicode-encoded character "\u00BB" and thus didn't do any scraping.
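One encoding-agnostic workaround (a sketch, not the fix from the original post) is to match the link text with a regex, so the "»" character never has to compare equal across encodings:
# Match on the stable part of the link text instead of the unicode arrow.
next_link = agent.page.link_with(:text => /Next Page/)
next_link.click if next_link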