How can I tell Sentry not to alert on certain exceptions?

I have a Rails 5 application using raven-ruby to send exceptions to Sentry which then sends alerts to our Slack.
Raven.configure do |config|
  config.dsn = ENV['SENTRY_DSN']
  config.environments = %w[ production development ]
  config.excluded_exceptions += []
  config.async = lambda { |event|
    SentryWorker.perform_async(event.to_hash)
  }
end
class SentryWorker < ApplicationWorker
  sidekiq_options queue: :default
  def perform(event)
    Raven.send_event(event)
  end
end
It's normal for our Sidekiq jobs to throw exceptions and be retried. These are mostly intermittent API errors and timeouts which clear up on their own in a few minutes. Sentry is dutifully sending these false alarms to our Slack.
I've already added the retry_count to the jobs. How can I prevent Sentry from sending exceptions with a retry_count < N to Slack while still alerting for other exceptions? An example that should not be alerted will have extra context like this:
sidekiq: {
  context: Job raised exception,
  job: {
    args: [{...}],
    class: SomeWorker,
    created_at: 1540590745.3296254,
    enqueued_at: 1540607026.4979043,
    error_class: HTTP::TimeoutError,
    error_message: Timed out after using the allocated 13 seconds,
    failed_at: 1540590758.4266324,
    jid: b4c7a68c45b7aebcf7c2f577,
    queue: default,
    retried_at: 1540600397.5804272,
    retry: True,
    retry_count: 2
  },
}
What are the pros and cons of not sending them to Sentry at all vs sending them to Sentry but not being alerted?

Summary
An option that has worked well for me is to configure Sentry's should_capture alongside Sidekiq's sidekiq_retries_exhausted, using a custom attribute on the exception.
Details
1a. Add the custom attribute
You can add a custom attribute to an exception. You can define this on any error class with attr_accessor:
class SomeError
  attr_accessor :ignore
  alias ignore? ignore
end
1b. Rescue the error, set the custom attribute, & re-raise
def perform
  # do something
rescue SomeError => e
  e.ignore = true
  raise e
end
2. Configure should_capture
should_capture lets you capture exceptions only when they meet defined criteria. The exception is passed to it, and you can read the custom attribute from it. The respond_to? check keeps exceptions that lack the attribute capturable.
config.should_capture = ->(e) { !(e.respond_to?(:ignore?) && e.ignore?) }
3. Flip the custom attribute when retries are exhausted
There are two ways to define what happens when a job dies, depending on the Sidekiq version in use. To apply the behaviour globally on Sidekiq v5.1+, use a death handler. To apply it to a particular worker, or on versions before v5.1, use sidekiq_retries_exhausted.
sidekiq_retries_exhausted { |_job, ex| ex.ignore = false }
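
The flow of the three steps can be tried out without Sentry or Sidekiq. The sketch below is only an illustration: FlakyApiError, SHOULD_CAPTURE, perform_with_flag and raised_error are hypothetical names, and the lambda merely stands in for the predicate that would be handed to should_capture.

```ruby
# Hypothetical error class; in a real app this would be the API/timeout error.
class FlakyApiError < StandardError
  attr_accessor :ignore
  alias ignore? ignore
end

# Stand-in for the capture predicate: capture unless the error is flagged.
# respond_to? keeps exceptions without the attribute capturable.
SHOULD_CAPTURE = ->(error) { !(error.respond_to?(:ignore?) && error.ignore?) }

# Stand-in for a job's perform method: flag the error, then re-raise it.
def perform_with_flag
  raise FlakyApiError, 'timed out'
rescue FlakyApiError => e
  e.ignore = true
  raise
end

# Helper that returns the exception raised by the block (or nil).
def raised_error
  yield
  nil
rescue StandardError => e
  e
end

flagged = raised_error { perform_with_flag }
puts SHOULD_CAPTURE.call(flagged)                  # false: skipped while retrying
flagged.ignore = false                             # what the exhausted hook does
puts SHOULD_CAPTURE.call(flagged)                  # true: reported once retries run out
puts SHOULD_CAPTURE.call(ArgumentError.new('bug')) # true: unrelated errors still reported
```

The important detail is that the same exception object carries the flag from the rescue block to the predicate, so flipping it in the retries-exhausted hook is enough to let the final failure through.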

You can filter out the entire event if the retry_count is < N (this can be done inside the Sidekiq worker you posted). You will lose the data on how often these non-alerting failures happen, but the alerts themselves will not be too noisy.
class SentryWorker < ApplicationWorker
  sidekiq_options queue: :default
  def perform(event)
    # Sidekiq passes job arguments through JSON, so the hash keys arrive as strings.
    retry_count = event.dig('extra', 'sidekiq', 'job', 'retry_count')
    # Send events that have no retry count, or that have reached N retries.
    if retry_count.nil? || retry_count >= N
      Raven.send_event(event)
    end
  end
end
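
The filter itself can be checked in isolation. This is a minimal sketch, assuming the retry count sits under extra / sidekiq / job as in the question's context dump; forward_event?, through_queue and RETRY_ALERT_THRESHOLD are hypothetical names, and the JSON round trip imitates what happens to job arguments on their way through the queue.

```ruby
require 'json'

# Hypothetical threshold: forward only once a job has reached this many retries.
RETRY_ALERT_THRESHOLD = 3

# Returns true when the event should be forwarded to Sentry.
# Events without Sidekiq context (retry_count is nil) are always forwarded.
def forward_event?(event, threshold = RETRY_ALERT_THRESHOLD)
  retry_count = event.dig('extra', 'sidekiq', 'job', 'retry_count')
  retry_count.nil? || retry_count >= threshold
end

# Imitates the JSON serialization of job arguments: symbol keys become strings.
def through_queue(event)
  JSON.parse(JSON.generate(event))
end

early = through_queue({ extra: { sidekiq: { job: { retry_count: 2 } } } })
late  = through_queue({ extra: { sidekiq: { job: { retry_count: 5 } } } })
web   = through_queue({ extra: {} })

puts forward_event?(early) # false
puts forward_event?(late)  # true
puts forward_event?(web)   # true
```

Keeping the decision in a small predicate makes it easy to unit test and to reuse if the threshold later needs to differ per worker.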
Another idea is to set a different fingerprint depending on whether this is a retry or not. Like this:
class MyJobProcessor < Raven::Processor
  def process(data)
    retry_count = data.dig(:extra, :sidekiq, :job, :retry_count)
    if (retry_count || 0) < N
      data["fingerprint"] = ["will-retry-again", "{{default}}"]
    end
    data # return the (possibly modified) event data to the next processor
  end
end
See https://docs.sentry.io/learn/rollups/?platform=javascript#custom-grouping
I didn't test this, but this should split up your issues into two, depending on whether sidekiq will retry them. You can then ignore one group but can still look at it whenever you need the data.
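
The grouping rule can also be written as a plain function and tested separately from Sentry. This is a sketch only: with_retry_fingerprint and RETRY_FINGERPRINT_THRESHOLD are hypothetical names, string keys are an assumption about the payload shape, and unlike the processor above it leaves events without any Sidekiq job context untouched so that ordinary web errors keep their default grouping.

```ruby
# Hypothetical threshold below which a failed job is expected to be retried.
RETRY_FINGERPRINT_THRESHOLD = 3

# Returns event data with a custom fingerprint when the job will be retried.
# The input hash is not mutated; a merged copy is returned instead.
def with_retry_fingerprint(data, threshold = RETRY_FINGERPRINT_THRESHOLD)
  job = data.dig('extra', 'sidekiq', 'job')
  return data unless job # not a Sidekiq failure: keep default grouping

  retry_count = job['retry_count'] || 0 # absent on the first failure
  return data unless retry_count < threshold

  data.merge('fingerprint' => ['will-retry-again', '{{default}}'])
end

retrying = { 'extra' => { 'sidekiq' => { 'job' => { 'retry_count' => 1 } } } }
p with_retry_fingerprint(retrying)['fingerprint']
# ["will-retry-again", "{{default}}"]
```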

A much cleaner approach, if you are trying to ignore exceptions belonging to a certain class, is to add them to your config file:
config.excluded_exceptions += ['ActionController::RoutingError', 'ActiveRecord::RecordNotFound']
In the above example, the exceptions Rails uses to generate 404 responses will be suppressed.
See the docs for more configuration options

From my point of view, the best option is to let Sentry hold all the exceptions and to configure alert rules in Sentry that decide whether an exception is sent to Slack.
To configure alerts, go to the Alerts option in the main menu of your Sentry account.
For example, I configured an alert that only sends a notification to Slack when an exception of type ControllerException occurs more than 10 times.
With this alert, we only receive the notification in Slack when all conditions are met.

Related

How to create and keep serialport connection in Ruby on Rails, handle infinity loop to create model with new messages?

I want to listen on a SerialPort and, when a message arrives, get or create a Log model with the id received from my device.
How can I load the SerialPort once automatically, keep the connection established, and work with the Log model in the listener when key_detected? is true?
This is my autoloaded module in lib:
module Serialport
  class Connection
    attr_accessor :key
    def initialize(port = "/dev/tty0")
      port_str = port
      baud_rate = 9600
      data_bits = 8
      stop_bits = 1
      parity = SerialPort::NONE
      @sp = SerialPort.new(port_str, baud_rate, data_bits, stop_bits, parity)
      @key_parts = []
      @key_limit = 16 # number of slots in the RFID card.
      while true do
        listener
      end
      @sp.close
    end
    def key_detected?
      @key_parts << @sp.getc
      if @key_parts.size >= @key_limit
        self.key = @key_parts.join()
        @key_parts = []
        true
      else
        false
      end
    end
    def listener
      if key_detected?
        puts self.key
        # log = Log.where(rfid: self.key).first_or_create
      end
    end
  end
end
Model:
class Log < ActiveRecord::Base
end
I would have written this in a comment, but it's a bit long... But I wonder if you could clarify your question, and I will update my answer as we go:
With all due respect to the Rails ability to "autoload", why not initialize a connection in an initialization file or while setting up the environment?
i.e., within a file in your_app/config/initializers called serial_port.rb:
SERIAL_PORT_CONNECTION = Serialport::Connection.new
Implementing an infinite loop within your Rails application will, in all probability, hang the Rails app and prevent it from being used as a web service.
What are you trying to accomplish?
If you just want to use active_record or active_support, why not just include these two gems in a separate script?
Alternatively, consider creating a separate thread for the infinite loop, or better yet, use a reactor. (They are not that difficult to write, but there are plenty pre-written in the wild, such as Iodine, which I wrote for implementing web services.)
Here's an example for an updated listener method, using a separate thread so you call it only once:
def listener
  Thread.new do
    loop { self.key while key_detected? }
    # this will never be called - same as in your code.
    @sp.close
  end
end
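
To show how the background thread can hand completed keys to the rest of the application, here is a self-contained sketch. It rests on assumptions: KeyListener is a hypothetical class, StringIO stands in for the serial port (a real port would block in getc instead of reaching end of input), and the 16-character key length is taken from the question.

```ruby
require 'stringio'

# Reads fixed-length keys from an IO-like object on a background thread and
# pushes each completed key onto a thread-safe queue.
class KeyListener
  KEY_LENGTH = 16 # characters per key, as in the question

  attr_reader :keys

  def initialize(io)
    @io = io
    @keys = Queue.new
  end

  # Starts the reader thread and returns it immediately.
  def start
    Thread.new do
      buffer = +''
      while (char = @io.getc) # nil at end of input ends the loop
        buffer << char
        next if buffer.size < KEY_LENGTH

        @keys << buffer
        buffer = +''
      end
    end
  end
end

# StringIO stands in for the serial port here.
listener = KeyListener.new(StringIO.new('0123456789ABCDEF' * 2))
listener.start.join
puts listener.keys.size # 2
puts listener.keys.pop  # 0123456789ABCDEF
```

A consumer (for example the code that finds or creates the Log record) can pop from the queue on its own thread, which keeps database work out of the read loop.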

How to implement RPC with RabbitMQ in Rails?

I want to implement an action that calls a remote service with RabbitMQ and presents the returned data. I implemented this (more as a proof of concept so far) in a similar way to the example taken from here: https://github.com/baowen/RailsRabbit and it looks like this:
controller:
def rpc
  text = params[:text]
  c = RpcClient.new('RPC server route key')
  response = c.call text
  render text: response
end
RabbitMQ RPC client:
class RpcClient < MQ
  attr_reader :reply_queue
  attr_accessor :response, :call_id
  attr_reader :lock, :condition
  def initialize()
    # initialize exchange:
    conn = Bunny.new(:automatically_recover => false)
    conn.start
    ch = conn.create_channel
    @x = ch.default_exchange
    @reply_queue = ch.queue("", :exclusive => true)
    @server_queue = 'rpc_queue'
    @lock = Mutex.new
    @condition = ConditionVariable.new
    that = self
    @reply_queue.subscribe do |_delivery_info, properties, payload|
      if properties[:correlation_id] == that.call_id
        that.response = payload.to_s
        that.lock.synchronize { that.condition.signal }
      end
    end
  end
  def call(message)
    self.call_id = generate_uuid
    @x.publish(message.to_s,
      routing_key: @server_queue,
      correlation_id: call_id,
      reply_to: @reply_queue.name)
    lock.synchronize { condition.wait(lock) }
    response
  end
  private
  def generate_uuid
    # very naive but good enough for code
    # examples
    "#{rand}#{rand}#{rand}"
  end
end
A few tests indicate that this approach works. On the other hand, this approach assumes creating a client (and subscribing to the queue) for every request on this action, which is inefficient according to the RabbitMQ tutorial. So I've got two questions:
Is it possible to avoid creating a queue for every Rails request?
How will this approach (with threads and mutex) interfere with my whole Rails environment? Is it safe to implement things this way in Rails? I'm using Puma as my web server, if it's relevant.
Is it possible to avoid creating a queue for every Rails request?
Yes - there is no need for every single request to have its own reply queue.
You can use the built-in direct-reply queue. See the documentation here.
If you don't want to use the direct-reply feature, you can create a single reply queue per rails instance. You can use a single reply queue, and have the correlation id help you figure out where the reply needs to go within that rails instance.
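
The correlation-id routing described above can be sketched without RabbitMQ. This is an illustration under assumptions, not Bunny's API: PendingCalls is a hypothetical class, and the thread calling deliver stands in for the single consumer subscribed to the shared reply queue.

```ruby
require 'securerandom'
require 'timeout'

# Tracks in-flight RPC calls so that one shared reply consumer can route each
# reply to the thread that is waiting for it.
class PendingCalls
  def initialize
    @lock = Mutex.new
    @waiting = {} # correlation_id => Queue that will hold the single reply
  end

  # Registers a new call and returns its correlation id.
  def register
    id = SecureRandom.uuid
    @lock.synchronize { @waiting[id] = Queue.new }
    id
  end

  # Called by the reply consumer; returns false for unknown or expired ids.
  def deliver(id, payload)
    slot = @lock.synchronize { @waiting[id] }
    return false unless slot

    slot << payload
    true
  end

  # Blocks until the reply arrives; returns nil if the timeout elapses first.
  def wait(id, seconds = 5)
    slot = @lock.synchronize { @waiting[id] }
    return nil unless slot

    Timeout.timeout(seconds) { slot.pop }
  rescue Timeout::Error
    nil
  ensure
    @lock.synchronize { @waiting.delete(id) }
  end
end

calls = PendingCalls.new
id = calls.register
Thread.new { calls.deliver(id, 'pong') } # stands in for the reply consumer
puts calls.wait(id) # pong
```

Each request thread would publish its message with the registered id as correlation_id and then call wait, so one long-lived client per Rails process can serve many concurrent Puma threads.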
How will this approach (with threads and mutex) interfere with my whole Rails environment? Is it safe to implement things this way in Rails?
What's the purpose of the lock / mutex in this code? It doesn't seem necessary to me, but I'm probably missing something since I haven't done Ruby in about 5 years :)

Timeout in a delayed job

I have some code that can potentially run for a long time. If it does, I want to kill it. Here is what I'm doing at the moment:
def perform
  Timeout.timeout(ENV['JOB_TIMEOUT'].to_i, Exceptions::WorkerTimeout) { do_perform }
end
private
def do_perform
  ...some code...
end
Where JOB_TIMEOUT is an environment variable with a value such as 10.seconds. I've got reports that this still doesn't prevent my job from running longer than it should.
Is there a better way to do this?
I believe delayed_job does some exception handling voodoo with multiple retries etc., and I also think do_perform will return immediately while the job continues as usual in another thread. I would imagine a better approach is doing flow control inside the worker:
require 'timeout'
def perform
  # A nil timeout will continue with no timeout, protect against unset ENV
  timeout = (ENV['JOB_TIMEOUT'] || 10).to_i
  do_stuff
  begin
    Timeout.timeout(timeout) { do_long_running_stuff }
  rescue Timeout::Error
    clean_up_after_self
    notify_business_logic_of_failure
  end
end
This will work. Added benefits are not coupling delayed_job so tightly with your business logic - this code can be ported to any other job queueing system unmodified.
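
One detail worth checking in isolation is how the environment value is parsed: nil.to_i is 0, and Timeout.timeout runs the block with no limit when given 0 or nil, so an unset JOB_TIMEOUT silently disables the timeout. The sketch below guards against that; job_timeout, run_with_timeout and DEFAULT_JOB_TIMEOUT are hypothetical names introduced for illustration.

```ruby
require 'timeout'

DEFAULT_JOB_TIMEOUT = 10 # seconds; hypothetical fallback

# Parses a timeout setting, falling back to the default when the value is
# missing or does not start with a positive integer.
def job_timeout(raw, default = DEFAULT_JOB_TIMEOUT)
  seconds = raw.to_i # nil.to_i => 0, "10.seconds".to_i => 10
  seconds.positive? ? seconds : default
end

# Runs the block under the timeout and reports the outcome as a symbol.
def run_with_timeout(seconds)
  Timeout.timeout(seconds) { yield }
  :finished
rescue Timeout::Error
  :timed_out
end

puts job_timeout(nil)                    # 10
puts job_timeout('10.seconds')           # 10
puts job_timeout('3')                    # 3
puts run_with_timeout(0.05) { sleep 1 }  # timed_out
puts run_with_timeout(1) { :quick }      # finished
```

In a worker, the call would look like run_with_timeout(job_timeout(ENV['JOB_TIMEOUT'])) { do_long_running_stuff }, with the cleanup code running when :timed_out comes back.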

Testing error callback from Delayed::Job with RSpec

all,
I have a custom Delayed::Job setup that uses the success and error callbacks to change the attributes of the object that is being modified in the background. This object is interacting with an external API. To test this, I'm using RSpec with VCR to record external API interactions.
Here's my worker:
class SuperJob < Struct.new(:thingy_id)
  include JobMethods
  def perform
    thing = Thingy.find(thingy_id)
    run_update(thing)
  end
  def success(job)
    thing = Thingy.find_by_job_id(job.id)
    thing.update(job_finished_at: Time.now, job_id: nil)
  end
  def error(job, exception)
    thing = Thingy.find_by_job_id(job.id)
    thing.update(job_id: -1, disabled: true)
  end
end
Here are my DJ settings:
Delayed::Worker.delay_jobs = !Rails.env.test?
Delayed::Worker.max_run_time = 2.minutes
I've successfully used RSpec to test the results of the success callback. What I'd like to do is test the results of the error callback. The external API doesn't have any particular limit on its response time, so for my app I'd like to limit the maximum wait time to 2 minutes (as seen in the max_run_time setting for DJ).
Now, how do I test that? The API isn't returning a timeout, so I'm not sure how I need to handle this in VCR. The DJ job isn't running in a queue and I don't particularly want the suite to delay for 2 minutes on every run.
Thoughts or suggestions would be greatly appreciated! Thanks!

Resolv::DNS - How to handle timeouts, errors

I'm using the following function in Ruby on Rails:
def isGoogleEmailAddress?(email_domain)
  Resolv::DNS.open({:nameserver=>["8.8.8.8"]}) do |r|
    mx = r.getresources(email_domain,Resolv::DNS::Resource::IN::MX)
    if mx.any? {|server| server.exchange.to_s.downcase.include? "google"} then
      return true
    end
    return false
  end
end
Is there a way to handle cases where Resolv fails, times out, raises errors, etc.?
Look through the documentation for the Resolv class and add exception handlers for the various errors/exceptions the class can raise.
They're easy to pick out. Look for classes whose names end in Error and Timeout, such as Resolv::ResolvError and Resolv::ResolvTimeout.
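
As a rough sketch of such handlers, the function below rescues Resolv::ResolvError and Resolv::ResolvTimeout (plus SystemCallError for socket-level failures) and returns nil so the caller can tell "not Google" apart from "could not find out". The name google_mx? and the injected resolver argument are assumptions made here so the lookup can be stubbed. Note also that a failed lookup may come back from getresources as an empty list rather than an exception, so an empty result is not proof that the domain has no MX records.

```ruby
require 'resolv'

# Returns true or false when the lookup succeeds and nil when it fails.
# The resolver is injected; any object with a Resolv::DNS-style
# getresources(name, typeclass) method works.
def google_mx?(email_domain, resolver)
  records = resolver.getresources(email_domain, Resolv::DNS::Resource::IN::MX)
  records.any? { |mx| mx.exchange.to_s.downcase.include?('google') }
rescue Resolv::ResolvError, Resolv::ResolvTimeout, SystemCallError
  nil
end

# Usage against a real name server (needs network access, so left commented):
# Resolv::DNS.open(nameserver: ['8.8.8.8']) do |dns|
#   dns.timeouts = 3 # seconds to wait per attempt
#   p google_mx?('example.com', dns)
# end
```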
