I would like to configure a global retry limit in Sidekiq. By default Sidekiq allows up to 25 retries, but I want to set a lower limit for all workers, so that jobs whose workers don't explicitly specify a retry count don't go through the long default maximum retry period.
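To illustrate why this matters: the Sidekiq wiki documents the retry backoff as (retry_count ** 4) + 15 seconds (plus some random jitter), so the default 25 retries keep a failing job alive for roughly 20 days. A quick sketch of the arithmetic, ignoring the jitter:

```ruby
# Sidekiq's documented minimum retry delay (jitter ignored): (count ** 4) + 15 seconds
delays = (0...25).map { |count| (count ** 4) + 15 }

total_seconds = delays.sum
days = total_seconds / 86_400.0

puts total_seconds  # => 1763395
puts days.round(1)  # => 20.4
```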
You can configure this in your sidekiq.yml:
:max_retries: 10
:queues:
- queue_1
- queue_2
Or set it in an initializer:
Sidekiq.default_worker_options['retry'] = 10
Refer to the docs here:
https://github.com/mperham/sidekiq/wiki/Advanced-Options#workers
This value is stored in options and (AFAIK) has no nifty setter, so here you go:
Sidekiq.options[:max_retries] = 5
It might be set for RetryJobs in the middleware initializer as well.
You can use Sidekiq.default_worker_options in your initializer. So to set a lower limit it'd be
Sidekiq.default_worker_options = { retry: 5 }
I'm currently setting this up to limit the amount of error noise created by our staging environments (to stay well below our error-handling service's limits). It seems that the key is now max_retries when changing the count, and retry is a boolean for whether the job should retry at all or go straight to the "Dead" queue.
https://github.com/mperham/sidekiq/wiki/Error-Handling#automatic-job-retry
This is what it looks like in my Sidekiq config file:
if Rails.env.staging?
  Sidekiq.default_worker_options['max_retries'] = 5
end
UPDATE: could have been my own confusion, but for some reason, default_worker_options did not seem to be working consistently for me. I ended up changing it to this and it worked as I hoped. Failed jobs went straight to the Dead queue:
Sidekiq.options[:max_retries] = 0
I am using Sidekiq in my Rails application to process background jobs.
sidekiq.yml
:verbose: false
:pidfile: ./tmp/pids/sidekiq.pid
:logfile: ./log/sidekiq.log
:concurrency: 5
:queues:
- ["queue1", 1]
- ["queue2", 1]
- ["queue3", 1]
- ["queue4", 1]
- ["queue5", 1]
- critical
- default
- low
:max_retries: 1
I want to run only one task per queue at a time.
For example: if I enqueue the same worker 3 times in queue1, I want it to process job 1, then job 2, then job 3.
I added a weight of 1 to the queue name (e.g. ["queue1", 1]) in sidekiq.yml, thinking this configuration would run only one job at a time in queue1. But that is not happening.
How to achieve the above configuration in sidekiq?
First, a slight correction of terminology. "Workers" are constantly running threads that consume "jobs" from "queues". Hence you don't process worker1; instead, worker1 will process jobs from queue1.
Now, coming to the solution:
It seems that your use case is that you don't want a specific type of job (one that uploads to S3, in your case) to be executed on more than one worker at a given instant.
To achieve that, you need a locking mechanism at the job level. Sidekiq doesn't provide locking on its own, so you need an external gem for that.
I have used the Redis-based 'sidekiq-lock' gem to achieve something similar in my projects.
Gem Link
Usage is as follows:
class Worker
  include Sidekiq::Worker
  include Sidekiq::Lock::Worker

  # static lock that expires after one second (timeout is in milliseconds)
  sidekiq_options lock: { timeout: 1000, name: 'unique-lock-name' }

  def perform
    # your code to upload to S3
  end
end
This will ensure that only one instance of a given job is running on one worker at a time.
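The effect of such a lock can be sketched in plain Ruby. Here a process-local Mutex stands in for the shared Redis lock, just to show that the critical section never runs concurrently:

```ruby
require "thread"

lock = Mutex.new
active = 0
max_active = 0

threads = 5.times.map do
  Thread.new do
    # everything inside synchronize plays the role of a locked perform
    lock.synchronize do
      active += 1
      max_active = [max_active, active].max
      sleep 0.005  # simulate the S3 upload
      active -= 1
    end
  end
end
threads.each(&:join)

puts max_active  # => 1 — the lock serialized the "uploads"
```

In the real setup the lock lives in Redis, so it serializes across processes and machines, not just across threads in one process.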
Sidekiq explicitly does not allow what you want to do.
https://github.com/mperham/sidekiq/wiki/Best-Practices#3-embrace-concurrency
Just use the built-in ActiveJob, because it sounds like you want a FIFO queue. Sidekiq is for idempotent, concurrent background processing.
I think the feature you want is pinning the worker to a queue. Add it to your worker class:
class YourWorker
  include Sidekiq::Worker
  sidekiq_options queue: :queue1
end
But there is a caveat: this worker's jobs can only be queued on queue1.
The queues setting controls how often a queue is checked for jobs, i.e. its priority relative to the other queues; it does not limit parallel jobs in that queue.
You can also spin up multiple Sidekiq processes, each with a limited number of threads (that's what the Sidekiq documentation suggests for this case), but this may not be practical for many low-load queues, where most of the processes would sit idle.
To limit parallel jobs in a particular queue within one process, you can use the sidekiq-limit_fetch gem with settings like:
:limits:
queue1: 1
queue2: 1
You can use sidekiq-limit_fetch to enable advanced options for Sidekiq. To limit concurrency on individual queues, do the following.
Step 1. Add gem 'sidekiq-limit_fetch' to your Gemfile.
Step 2. Add limits for each queue in your sidekiq.yml:
:limits:
queue1: 1
queue2: 1
queue3: 1
queue4: 1
Set your Sidekiq concurrency to 1. That way you'll have only one thread working in the process, which gives you the desired behavior: only one job is processed at a time.
Just change your configuration file:
:verbose: false
:pidfile: ./tmp/pids/sidekiq.pid
:logfile: ./log/sidekiq.log
:concurrency: 1 # Change Here
:queues:
- ["queue1", 1]
- ["queue2", 1]
- ["queue3", 1]
- ["queue4", 1]
- ["queue5", 1]
- critical
- default
- low
:max_retries: 1
I would like to up/down-scale my dynos automatically depending on the size of the pending list.
I heard about HireFire, but it only scales once every minute, and I need it to be (almost) real time.
I would like to scale my dynos so that the pending list is ~always empty.
I was thinking about doing it myself (with a scheduler (~15 s delay) and the Heroku API), because I'm not sure there is anything out there; and if not, do you know any monitoring tool which could send an email alert if the queue length exceeds a fixed size (similar to Apdex on New Relic)?
A potential custom-code solution is included below. There are also two New Relic plugins that do Resque monitoring. I'm not sure if either does email alerts based on exceeding a certain queue size. Using Resque hooks you could output log messages that trigger email alerts (or Slack, HipChat, PagerDuty, etc.) via a service like Papertrail or Loggly. This might look something like:
def after_enqueue_pending_check(*args)
  job_count = Resque.info[:pending].to_i
  if job_count > PENDING_THRESHOLD
    Rails.logger.warn('pending queue threshold exceeded')
  end
end
Instead of logging you could send an email but without some sort of rate limiting on the emails you could easily get flooded if the pending queue grows rapidly.
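A minimal sketch of that rate limiting (process-local state; the 300-second interval is an arbitrary illustrative choice):

```ruby
ALERT_INTERVAL = 300  # minimum seconds between alerts (illustrative value)

last_alert_at = nil
alerts_sent = 0

maybe_alert = lambda do |now|
  if last_alert_at.nil? || now - last_alert_at >= ALERT_INTERVAL
    last_alert_at = now
    alerts_sent += 1  # in real code: send the email / log the warning
  end
end

# simulate the threshold being exceeded every 60 seconds for 10 minutes
(0..9).each { |minute| maybe_alert.call(minute * 60) }

puts alerts_sent  # => 2 instead of 10
```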
I don't think there is a Heroku add-on or other service that can do the scaling in real time. There is a gem that will do this using the deprecated Heroku API; instead, you can do it yourself using Resque hooks and the Heroku platform-api gem.
This untested example uses platform-api to scale the 'worker' dynos up and down. Just as an example, I included one worker for every three pending jobs. The downscale only ever resets the workers to 1, and only when there are no pending and no working jobs. This is not ideal and should be updated to fit your needs. See here for information on making sure you don't lose jobs when scaling down workers: http://quickleft.com/blog/heroku-s-cedar-stack-will-kill-your-resque-workers
require 'platform-api'

def after_enqueue_upscale(*args)
  heroku = PlatformAPI.connect_oauth('OAUTH_TOKEN')
  worker_count = heroku.formation.info('app-name', 'worker')["quantity"]
  job_count = Resque.info[:pending].to_i

  # one worker for every 3 jobs (minimum of 1)
  new_worker_count = (job_count / 3) + 1
  return if new_worker_count <= worker_count

  heroku.formation.update('app-name', 'worker', {"quantity" => new_worker_count})
end

def after_perform_downscale
  heroku = PlatformAPI.connect_oauth('OAUTH_TOKEN')
  if Resque.info[:pending].to_i == 0 && Resque.info[:working].to_i == 0
    heroku.formation.update('app-name', 'worker', {"quantity" => 1})
  end
end
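To sanity-check the one-worker-per-three-jobs formula used in the upscale hook (integer division, minimum of 1):

```ruby
# new_worker_count = (pending_jobs / 3) + 1
worker_for = ->(job_count) { (job_count / 3) + 1 }

puts [0, 2, 3, 7, 10].map(&worker_for).inspect  # => [1, 1, 2, 3, 4]
```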
I'm having a similar issue and ran into HireFire:
https://www.hirefire.io/
For Ruby, use:
https://github.com/hirefire/hirefire-resource
It works much like AdeptScale (https://www.adeptscale.com/), but HireFire can also scale workers and does not limit itself to just web dynos. Hope this helps!
I'm using a gem to get code results from Ideone.com. The gem submits code to Ideone and then checks for the results page. It checks timeout times and then gives up if there's no result. The problem is it might give up too early, but I also don't want it to wait too long if there's not going to be a result. Is there a way to know when one should give up hope?
This is the relevant code:
begin
  sleep 3 if i > 0
  res = JSON.load(
    Net::HTTP.post_form(
      URI.parse("http://ideone.com/ideone/Index/view/id/#{loc}/ajax/1"),
      {}
    ).body
  )
  i += 1
end while res['status'] != '0' && i < timeout

if i == timeout
  raise IdeoneError, "Timed out while waiting for code result."
end
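One refinement worth considering over a fixed 3-second sleep is exponential backoff, so early attempts poll quickly and later ones wait longer (the base delay and cap here are illustrative choices, not values from the gem):

```ruby
# delay before attempt i: 3 * 2**i seconds, capped at 60
delays = 5.times.map { |i| [3 * (2 ** i), 60].min }

puts delays.inspect  # => [3, 6, 12, 24, 48]
```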
Sounds like you want to adjust the sleep delay and the number of attempts. There are no absolute values suitable for every case, so you should pick ones appropriate for your application.
Unfortunately the gem has both of these parameters (3-second delay and 4 attempts) hardcoded, so there is no elegant way to change them. You can either fork the gem and change its code, or monkey-patch the value of the TIMEOUT constant with Module#const_set (http://apidock.com/ruby/Module/const_set). However, you won't be able to change the delay between attempts without rewriting the gem's .run method.
FYI, Net::HTTP has its own timeouts: how long to wait for the connection to ideone.com and for the response. If they are exceeded, Net::HTTP raises a Timeout exception. The setters are #read_timeout= (http://ruby-doc.org/stdlib-2.0/libdoc/net/http/rdoc/Net/HTTP.html#method-i-read_timeout-3D) and #open_timeout=.
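The const_set approach looks roughly like this (using a stand-in module, since the gem's actual constant owner may differ; the old constant is removed first to avoid an "already initialized constant" warning):

```ruby
module IdeoneClient  # stand-in for the gem's module
  TIMEOUT = 4
end

# remove the old constant, then redefine it with the desired value
IdeoneClient.send(:remove_const, :TIMEOUT)
IdeoneClient.const_set(:TIMEOUT, 10)

puts IdeoneClient::TIMEOUT  # => 10
```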
In my application I want to terminate the exec! command of my SSH connection after a specified amount of time.
I found the :timeout for the Net::SSH.start command but following the documentation this is only for the initial connection. Is there something equivalent for the exec command?
My first guess would be not to use exec!, since it waits until the command is finished, but instead to use exec and surround the call with a loop that checks the execution status on every iteration and fails after the given amount of time.
Something like this, if I understood the documentation correctly:
server = Net::SSH.start(...)
server.exec("some command")
start_time = Time.now
terminate_calculation = false
server.loop(0.1) do
  terminate_calculation = (Time.now - start_time) > 60
  !terminate_calculation
end
However this seems dirty to me. I would expect something like server.exec("some command", :timeout => 60). Maybe there is a built-in function for achieving this?
I am not sure if this would actually work in an SSH context, but Ruby itself has a Timeout module:
require 'timeout'

server = Net::SSH.start(...)
Timeout.timeout(60) do
  server.exec! "some command"
end
This would raise Timeout::Error after 60 seconds. Check out the docs.
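The behavior is easy to verify in isolation, with the timeout shrunk to a fraction of a second so it runs quickly (the sleep stands in for the long-running exec!):

```ruby
require 'timeout'

result =
  begin
    Timeout.timeout(0.1) { sleep 1 }  # stand-in for the long-running exec!
    :finished
  rescue Timeout::Error
    :timed_out
  end

puts result  # => timed_out
```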
I don't think there's a native way to do it in net/ssh. See the code, there's no additional parameter for that option.
One way would be to handle timeouts in the command you call - see this answer on Unix & Linux SE.
I think your way is better, as you don't introduce external dependencies in the systems you connect to.
Another solution is to set ConnectTimeout option in OpenSSH configuration files (~/.ssh/config, /etc/ssh_config, ...)
Check more info in
https://github.com/net-ssh/net-ssh/blob/master/lib/net/ssh/config.rb
What I did is have a thread do the event handling, then loop for a defined number of seconds until the channel is closed. If the channel is still open after those seconds pass, close it and continue execution.
This question already has an answer here: Overriding/Modifying Rails Class (ActiveResource) (1 answer). Closed 3 years ago.
I'm trying to contact a REST API using ActiveResource on Rails 2.3.2.
I'm attempting to use the timeout functionality so that if the resource I'm contacting is down I can fail quickly - I'm doing this with the following:
class WorkspaceResource < ActiveResource::Base
self.timeout = 5
self.site = "http://mysite.com/restAPI"
end
However, when I try to contact the service when I know it isn't available, the class only times out after the default 60 seconds. I can see from the error stack that the timeout error does indeed come from an ActiveResource class in my gem folder that has the proper functions to allow timeout settings, but my set timeout never seems to work.
Any thoughts?
So apparently the issue is not that timeout is not functioning. I can run a server locally, make it not return a response within the timeout limit, and see that timeout works.
The issue is in fact that if the server does not accept the connection, timeout does not function as I expected it to - it doesn't function at all. It appears as though timeout only works when the server accepts the connection but takes too long to respond.
To me, this seems like an issue - shouldn't timeout also work when the server I'm contacting is down? If not, there should be another mechanism to stop a bunch of requests from hanging...anyone know of a quick way to do this?
The problem
If you're running on Ruby 1.8.x then the problem is its lack of real system threads.
As you can read first here and then here, there are systemic problems with timeouts in Ruby. It's an interesting discussion, but for you in particular, some comments suggest that the timeout is effectively ignored and defaults to 60 seconds - exactly what you are seeing.
Solutions ...
I have a similar issue with our own product when trying to send emails - if the email server is down the thread blocks. For me the solution was to spin the request off on a separate thread and therefore my main request-processing thread doesn't block.
There are non-blocking libraries out there for Ruby but perhaps you could take a look first at this System Timeout Gem.
An option open to anyone using Rails behind a proxy like nginx would be to set the upstream timeout to a lower number - that way you'll get notified if the server is taking too long. I'd only do this if I were really stuck for a solution.
Last but not least, it's possible that running Rails 2.3.2 on top of Ruby 1.9.1 will fix the issue.
Alternatively, you could try to catch these connection errors and retry once (after a certain period of time) just to make sure the connection is really down.
retried = false
begin
  @businesses = Business.find(:all, :params => { :shop_domain => @shop.domain })
  retried = false
rescue ActiveResource::TimeoutError => ex
  # raise ex
rescue ActiveResource::ConnectionError, ActiveResource::ServerError, ActiveResource::ClientError => ex
  unless retried
    sleep(((ex.respond_to?(:response) && ex.response['Retry-After']) || 5).to_i)
    retried = true
    retry
  else
    # raise ex
  end
end
Inspired by this solution from Shopify for paginating a large number of records. https://ecommerce.shopify.com/c/shopify-apis-and-technology/t/paginate-api-results-113066