Rescue from single resource exceptions in Rails index action - ruby-on-rails

For the show action, returning a 500 error on uncaught exceptions is fine, but for the index action, it would be helpful if a single bad resource didn't cause the whole request to fail. Is there a way to rescue from those exceptions and return the rest of the resources along with a list of errors?
Details: I'm using RABL to render JSON templates like this (but I think the solution is likely general rather than specific to this):
# app/controllers/happenings_controller.rb
def index
@happenings = current_person.happenings
end
# app/views/happenings/index.json.rabl
collection @happenings
extends 'happenings/show'
# app/views/happenings/show.json.rabl
object @happening
attributes :id, :name, :description
node :creator, if: lambda { |s| s.creator? } do |s|
# !!! This is where an exception on a single resource was blowing up the whole request
partial("people/show", :object => s.creator)
end

Sure! Just rescue the exception.
This answer probably won't help you much, though, because you haven't provided a meaningful description of where the failure actually occurs.
It all depends on where your errors happen. Do you want to rescue from database queries, rendering errors, or environment problems?
Depending on the cause of the error there are different solutions: monkey-patching RABL, a simple rescue in your controller, or some middleware handling the error.
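Whatever the source, if the goal is "return the rest of the resources along with a list of errors", the general shape is to rescue per element rather than around the whole render. A minimal plain-Ruby sketch (the helper name and error format are made up here, not RABL API):

```ruby
# Collect per-item results and errors so that one bad resource produces
# an error entry instead of failing the whole request.
def render_each(items)
  results = []
  errors  = []
  items.each do |item|
    begin
      results << yield(item)
    rescue StandardError => e
      errors << { item: item, error: e.message }
    end
  end
  [results, errors]
end

results, errors = render_each([1, 2, 0]) { |n| 10 / n }
# results == [10, 5]; errors holds one entry for the division by zero
```

In a controller you could then render `{ happenings: results, errors: errors }` so the good resources still come through.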

Related

What is the use of ! in rails

What is the use of ! in rails?
Especially in these lines, from the Hartl tutorial:
users = User.order(:created_at).take(6)
50.times do
content = Faker::Lorem.sentence(5)
users.each { |user| user.microposts.create!(content: content) }
end
Basically this is creating tweets/microposts for 6 users.
I am really wondering why we need to use !
The important thing to remember is that in Ruby a trailing ! or ? are allowed on method names and become part of the method name, not a modifier added on. x and x! and x? are three completely different methods.
In Ruby the convention is to add ! to methods that make in-place modifications, that is they modify the object in fundamental ways. An example of this is String#gsub which returns a copy, and String#gsub! which modifies the string in-place.
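For example, with String#gsub and String#gsub! (note that the bang variant also returns nil when no substitution was made):

```ruby
s = "hello"
copy = s.gsub("l", "L")    # returns a modified copy
# s is still "hello", copy is "heLLo"

s.gsub!("l", "L")          # modifies s in place
# s is now "heLLo"

# The in-place variant signals "nothing changed" with nil:
"abc".gsub!("z", "Z")      # => nil
```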
In Rails this convention has been extended to also cover methods that raise an exception on failure instead of returning nil. This is best illustrated here:
Record.find_by(id: 10) # => Can return nil if not found
Record.find_by!(id: 10) # => Can raise ActiveRecord::RecordNotFound
Note that this is not always the case, as methods like find will raise exceptions even without the !. It's purely an informational component built into the method name and does not guarantee that it will or won't raise exceptions.
Update:
The reason for using exceptions is to make flow-control easier. If you're constantly testing for nil, you end up with highly paranoid code that looks like this:
def update
if (user.save)
if (purchase.save)
if (email.sent?)
redirect_to(success_path)
else
render(template: 'invalid_email')
end
else
render(template: 'edit')
end
else
render(template: 'edit')
end
end
In other words, you always need to be looking over your shoulder to be sure nothing bad is happening. With exceptions it looks like this:
def update
user.save!
purchase.save!
email.send!
redirect_to(success_path)
rescue ActiveRecord::RecordInvalid
render(template: 'edit')
rescue SomeMailer::EmailNotSent
render(template: 'invalid_email')
end
Where you can see the flow is a lot easier to understand. It describes "exceptional situations" as being less likely to occur so they don't clutter up the main code.
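The same contrast can be shown in plain Ruby with a toy stand-in for the models (the Step class, run method, and messages are all made up for illustration):

```ruby
# Toy stand-in for a model: save returns true/false, save! raises on failure.
class Step
  def initialize(ok)
    @ok = ok
  end

  def save
    @ok
  end

  def save!
    raise "step failed" unless @ok
    true
  end
end

# Exception-based flow: the happy path reads straight through,
# and any failure drops into the rescue clause.
def run(steps)
  steps.each(&:save!)
  :success
rescue RuntimeError
  :failed
end
```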

Can I put retry in rescue block without begin ... end?

I ran into a problem where PG falls out of sync (a well known problem) (example).
When PG falls out of sync, the id sequence stops incrementing and an ActiveRecord::RecordNotUnique error is raised.
But all the solutions proposed there (all I found) are manual: either run some operations in the console, or run a custom rake task.
However, I find this unsatisfying for production: each time it happens, users get 500s until someone administrating the server steps in to save the day. (And according to my test data, it may well occur frequently in my case.)
So I would like rather to patch ActiveRecord Base class to catch this specific error and rescue it.
I use this logic sometimes in controller:
class ApplicationController < ActionController::Base
rescue_from ActionController::ParameterMissing, ActiveRecord::RecordNotFound do |e|
# some logic here
end
end
However, here I don't need retry. Also, I would like to avoid deep monkey patching, for example overriding the Base create method.
So I was thinking of something like this:
module ActiveRecord
class Base
rescue ActiveRecord::RecordNotUnique => e
if e.message.include? '_pkey'
table = e.message.match(//) # regex to extract the table name
ActiveRecord::Base.connection.reset_pk_sequence!(table)
retry
else
raise
end
end
end
But it most likely doesn't work, as I'm not sure Rails/Ruby will understand what exactly it is being asked to retry.
Is there any solution?
P.S. Unrelated solutions to the overall sequence problem are also appreciated, as long as they work without manual command-line intervention and without leaving users unserved.
To answer the question you're asking, no. rescue can only be used from within a begin..end block or method body.
begin
bad_method
rescue SomeException
retry
end
def some_method
bad_method
rescue SomeException
retry
end
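One detail that makes the method-body form practical: local variables persist across retry, so the `||=` idiom can cap the number of attempts. A runnable sketch (names made up):

```ruby
# retry re-executes the method body, but local variables keep their
# values, so `attempts ||= 0` only initializes on the first pass.
def with_retries(max_attempts: 3)
  attempts ||= 0
  attempts += 1
  yield attempts
rescue RuntimeError
  retry if attempts < max_attempts
  raise
end

# Succeeds on the third attempt:
with_retries(max_attempts: 3) do |n|
  raise "flaky" if n < 3
  :ok
end
```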
rescue_from is just a framework helper method created because of how indirect the execution is in a controller.
To answer the question you're really asking, sure. You can override create_or_update with a rescue/retry.
module NonUniquePkeyRecovery
def create_or_update(*)
super
rescue ActiveRecord::RecordNotUnique => e
raise unless e.message.include? '_pkey'
self.class.connection.reset_pk_sequence!(self.class.table_name)
retry
end
end
ActiveSupport.on_load(:active_record) do
include NonUniquePkeyRecovery
end

ActiveRecord::RecordNotUnique global handling

I am trying to figure out a graceful way to handle a ActiveRecord::RecordNotUnique exception globally for all my ActiveRecord code. I know about the validates_uniqueness_of validation but I want to rescue the exception directly as I have a constraint on the database in order to avoid bad data due to race conditions. I also don't want to create a bunch of custom methods that directly handle the exception every time I want to save or update an object where this constraint can be violated.
I would prefer not to monkey patch ActiveRecord methods like save() but I am beginning to think that achieving graceful exception handling for all ActiveRecord objects in my code might require that. Below is some code that demonstrates what a solution would look like:
class Photo < ActiveRecord::Base
belongs_to :post
def save(*args)
super
rescue ActiveRecord::RecordNotUnique => error
errors[:base] << error.message
false
end
end
While this works if I call save directly on a Photo object it won't work if I save the object through another model using accepts_nested_attributes_for with validates_associated.
Any help would be greatly appreciated.
Thanks
Update
The desired outcome is to handle the exception and just add a key/value pair to the object's errors hash and then display form errors back to the user telling them that the email has been taken.
This is covered in the Action Controller Overview Rails Guide. In short, you can use the rescue_from method to register a handler for exceptions. If you use it in ApplicationController then it'll be inherited by all other controllers.
Here's the example from the Guide:
class ApplicationController < ActionController::Base
rescue_from ActiveRecord::RecordNotFound, with: :record_not_found
private
def record_not_found
render plain: "404 Not Found", status: 404
end
end
Go take a look for more information and an additional example.
What I was looking for was the inverse_of option when defining the association on each of the models. What inverse_of does is cause Rails to use the in-memory instance of the associated object instead of going to the db to fetch the record. I created a save method in the Photo model that looks like this:
def save(*args)
super
rescue ActiveRecord::RecordNotUnique => error
post.errors[:base] << "You can only have one photo be your header photo"
false
end
In the rescue block, when I call post.errors I get the unsaved, associated post object, rather than Rails looking one up in the db based on photo.post_id (which at this point is nil, because the invalid photo prevented the post from being persisted).
Here are the docs for inverse_of
http://guides.rubyonrails.org/association_basics.html#bi-directional-associations
My gem activerecord-transactionable allows you to do all of the following (a la carte):
rescue
retry with alternate logic (switch from create to find, for example)
add errors to the record that failed to save
use locking
use nested transactions
handle different kinds of database errors in different ways

How should my scraping "stack" handle 404 errors?

I have a rake task that is responsible for doing batch processing on millions of URLs. Because this process takes so long I sometimes find that URLs I'm trying to process are no longer valid -- 404s, site's down, whatever.
When I initially wrote this, there was basically just one site that would continually go down while processing, so my solution was to use open-uri, rescue any exceptions produced, wait a bit, and then retry.
This worked fine when the dataset was smaller, but now so much time goes by that I'm finding URLs that are no longer there and produce a 404.
In the case of a 404, my script just sits there and loops infinitely, which is obviously bad.
How should I handle cases where a page doesn't load successfully, and more importantly how does this fit into the "stack" I've built?
I'm pretty new to this, and Rails, so any opinions on where I might have gone wrong in this design are welcome!
Here is some anonymized code that shows what I have:
The rake task that makes a call to MyHelperModule:
# lib/tasks/my_app_tasks.rake
namespace :my_app do
desc "Batch processes some stuff at a later time."
task :process_the_batch => :environment do
# The dataset being processed
# is millions of rows so this is a big job
# and should be done in batches!
MyModel.where(some_thing: nil).find_in_batches do |my_models|
MyHelperModule.do_the_process my_models: my_models
end
end
end
end
MyHelperModule accepts my_models and does further stuff with ActiveRecord. It calls SomeClass:
# lib/my_helper_module.rb
module MyHelperModule
def self.do_the_process(args = {})
my_models = args[:my_models]
# Parallel.each(my_models, :in_processes => 5) do |my_model|
my_models.each do |my_model|
# Reconnect to prevent errors with Postgres
ActiveRecord::Base.connection.reconnect!
# Do some active record stuff
some_var = SomeClass.new(my_model.id)
# Do something super interesting,
# fun,
# AND sexy with my_model
end
end
end
SomeClass will go out to the web via WebpageHelper and process a page:
# lib/some_class.rb
require_relative 'webpage_helper'
class SomeClass
attr_accessor :some_data
def initialize(arg)
doc = WebpageHelper.get_doc("http://somesite.com/#{arg}")
# do more stuff
end
end
WebpageHelper is where the exception is caught and an infinite loop is started in the case of 404:
# lib/webpage_helper.rb
require 'nokogiri'
require 'open-uri'
class WebpageHelper
def self.get_doc(url)
begin
page_content = open(url).read
# do more stuff
rescue Exception => ex
puts "Failed at #{Time.now}"
puts "Error: #{ex}"
puts "URL: " + url
puts "Retrying... Attempt #: #{attempts.to_s}"
attempts = attempts + 1
sleep(10)
retry
end
end
end
TL;DR
Use out-of-band error handling and a different conceptual scraping model to speed up operations.
Exceptions Are Not for Common Conditions
There are a number of other answers that address how to handle exceptions for your use case. I'm taking a different approach by saying that handling exceptions is fundamentally the wrong approach here for a number of reasons.
In his book Exceptional Ruby, Avdi Grimm provides some benchmarks showing the performance of exceptions as ~156% slower than using alternative coding techniques such as early returns.
In The Pragmatic Programmer: From Journeyman to Master, the authors state "[E]xceptions should be reserved for unexpected events." In your case, 404 errors are undesirable, but are not at all unexpected--in fact, handling 404 errors is a core consideration!
In short, you need a different approach. Preferably, the alternative approach should provide out-of-band error handling and prevent your process from blocking on retries.
One Alternative: A Faster, More Atomic Process
You have a lot of options here, but the one I'm going to recommend is to handle 404 status codes as a normal result. This allows you to "fail fast," but also allows you to retry pages or remove URLs from your queue at a later time.
Consider this example schema:
ActiveRecord::Schema.define(:version => 20120718124422) do
create_table "webcrawls", :force => true do |t|
t.text "raw_html"
t.integer "retries"
t.integer "status_code"
t.text "parsed_data"
t.datetime "created_at", :null => false
t.datetime "updated_at", :null => false
end
end
The idea here is that you would simply treat the entire scrape as an atomic process. For example:
Did you get the page?
Great, store the raw page and the successful status code. You can even parse the raw HTML later, in order to complete your scrapes as fast as possible.
Did you get a 404?
Fine, store the error page and the status code. Move on quickly!
When your process is done crawling URLs, you can then use an ActiveRecord lookup to find all the URLs that recently returned a 404 status so that you can take appropriate action. Perhaps you want to retry the page, log a message, or simply remove the URL from your list of URLs to scrape--"appropriate action" is up to you.
By keeping track of your retry counts, you could even differentiate between transient errors and more permanent errors. This allows you to set thresholds for different actions, depending on the frequency of scraping failures for a given URL.
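That classification step might look something like this (method name and thresholds are arbitrary):

```ruby
# Decide what to do with a URL based on its last status code and how
# many times it has already been retried. Thresholds are arbitrary.
def next_action(status_code, retries)
  case status_code
  when 200 then :parse_later
  when 404 then retries >= 3 ? :drop_url : :retry_later
  else          :retry_later
  end
end

next_action(200, 0)  # => :parse_later
next_action(404, 5)  # => :drop_url
```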
This approach also has the added benefit of leveraging the database to manage concurrent writes and share results between processes. This would allow you to parcel out work (perhaps with a message queue or chunked data files) among multiple systems or processes.
Final Thoughts: Scaling Up and Out
Spending less time on retries or error handling during the initial scrape should speed up your process significantly. However, some tasks are just too big for a single-machine or single-process approach. If your process speedup is still insufficient for your needs, you may want to consider a less linear approach using one or more of the following:
Forking background processes.
Using dRuby to split work among multiple processes or machines.
Maximizing core usage by spawning multiple external processes using GNU parallel.
Something else that isn't a monolithic, sequential process.
Optimizing the application logic should suffice for the common case; if not, scale up to more processes or out to more servers. Scaling out will certainly be more work, but will also expand the processing options available to you.
Curb has an easier way of doing this and can be a better (and faster) option instead of open-uri.
Errors Curb reports (and that you can rescue from and act on):
http://curb.rubyforge.org/classes/Curl/Err.html
Curb gem:
https://github.com/taf2/curb
Sample code:
def browse(url)
c = Curl::Easy.new(url)
begin
c.connect_timeout = 3
c.perform
return c.body_str
rescue Curl::Err::NotFoundError
handle_not_found_error(url)
end
end
def handle_not_found_error(url)
puts "This is a 404!"
end
You could just raise the 404's:
rescue Exception => ex
raise ex if ex.message['404']
# retry for non-404s
end
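This works because String#[] with a string argument returns the matching substring when present and nil otherwise, so it doubles as a truthiness test:

```ruby
# String#[] with a string argument: the substring if found, nil if not.
"404 Not Found"["404"]             # => "404" (truthy)
"500 Internal Server Error"["404"] # => nil (falsy)
```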
It all just depends on what you want to do with 404's.
Let's assume that you just want to swallow them. Part of pguardiario's response is a good start: you can raise an error, and retry a few times...
# lib/webpage_helper.rb
require 'nokogiri'
require 'open-uri'
class WebpageHelper
def self.get_doc(url)
attempt_number = 0
begin
attempt_number = attempt_number + 1
page_content = open(url).read
# do more stuff
rescue Exception => ex
puts "Failed at #{Time.now}"
puts "Error: #{ex}"
puts "URL: " + url
puts "Retrying... Attempt #: #{attempt_number}"
sleep(10)
retry if attempt_number < 10 # Try ten times.
end
end
end
If you followed this pattern, it would just fail silently. Nothing would happen, and it would move on after ten attempts. I would generally consider that a Bad Plan(tm). Instead of just failing out silently, I would go for something like this in the rescue clause:
rescue Exception => ex
if attempt_number < 10 # Try ten times.
retry
else
raise "Unable to contact #{url} after ten tries."
end
end
and then throw something like this in MyHelperModule#do_the_process (you'd have to update your database to have an errors and error_message column):
my_models.each do |my_model|
# ... cut ...
begin
some_var = SomeClass.new(my_model.id)
rescue Exception => e
my_model.update_attributes(errors: true, error_message: e.message)
next
end
# ... cut ...
end
That's probably the easiest and most graceful way to do it with what you currently have. That said, if you're handling that many requests in one massive rake task, that's not very elegant. You can't restart it if something goes wrong, and it ties up a single process on your system for a long time. If you end up with any memory leaks (or infinite loops!), you find yourself in a place where you can't just say "move on". You should probably be using some kind of queueing system like Resque or Sidekiq, or Delayed Job (though it sounds like you have more items than Delayed Job would happily handle). I'd recommend digging into those if you're looking for a more elegant approach.
I actually have a rake task that does something remarkably similar. Here is the gist of what I did to deal with 404's, and you could apply it pretty easily.
Basically what you want to do is use the following code as a filter and create a logfile to store your errors. So before you grab the website and process it, first do the following.
Create/instantiate a logfile in your file:
@logfile = File.open("404_log_#{Time.now.strftime('%m-%d-%Y')}.txt", "w")
# The date is included in the filename in case you want to run diffs
# on your log files. (Avoid "/" in the strftime format here; it would
# be treated as a directory separator in the filename.)
Then change your WebpageHelper class to something like this:
class WebpageHelper
def self.get_doc(url)
response = Net::HTTP.get_response(URI.parse(url))
if (response.code.to_i == 404) notify_me(url)
else
page_content = open(url).read
# do more stuff
end
end
end
What this does is ping the page for a response code. The if statement checks whether the response code is a 404; if it is, it runs the notify_me method, otherwise it runs your commands as usual. I just arbitrarily created the notify_me method as an example. On my system I have it write to a txt file that it emails me upon completion. You could use a similar method to look at other response codes.
Generic logging method:
def notify_me(url)
puts "Failed at #{Time.now}"
puts "URL: " + url
@logfile.puts("There was a 404 error for the site #{url} at #{Time.now}.")
end
Regarding the problem you're experiencing, you can do the following:
class WebpageHelper
def self.get_doc(url)
retried = false
begin
page_content = open(url).read
# do more stuff
rescue OpenURI::HTTPError => ex
unless ex.io.status.first.to_i == 404
log_error ex.message
sleep(10)
unless retried
retried = true
retry
end
end
# FIXME: needs some refactoring
rescue Exception => ex
puts "Failed at #{Time.now}"
puts "Error: #{ex}"
puts "URL: " + url
puts "Retrying... Attempt #: #{attempts.to_s}"
attempts = attempts + 1
sleep(10)
retry
end
end
end
But I'd rewrite the whole thing in order to do parallel processing with Typhoeus:
https://github.com/typhoeus/typhoeus
where I'd assign a callback block which would do the handling of the returned data, thus decoupling the fetching of the page and the processing.
Something along these lines:
def on_complete(response)
end
def on_failure(response)
end
def run
hydra = Typhoeus::Hydra.new
urls.each do |url|
req = Typhoeus::Request.new(url)
req.on_complete = method(:on_complete).to_proc
hydra.queue(req)
end
hydra.run
# do something with all requests after all requests were performed, if needed
end
I think everyone's comments on this question are spot on. There is a lot of good info on this page. Here is my attempt at collecting this very hefty bounty. That being said, +1 to all answers.
If you are only concerned with 404s when using OpenURI, you can handle just that type of exception:
# lib/webpage_helper.rb
rescue OpenURI::HTTPError => ex
# handle OpenURI HTTP Error!
rescue Exception => e
# similar to the original
case e.message
when /404/ then puts '404!'
when /500/ then puts '500!'
# etc ...
end
end
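As a runnable sketch of that message-based dispatch (the method name is made up):

```ruby
# Dispatch on the exception message with regex-matching `when` clauses.
def classify_error(message)
  case message
  when /404/ then :not_found
  when /500/ then :server_error
  else :unknown
  end
end

classify_error("404 Not Found")  # => :not_found
```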
If you want a bit more, you can do different exception handling per type of error:
# lib/webpage_helper.rb
rescue OpenURI::HTTPError => ex
# handle OpenURI HTTP errors
rescue SyntaxError => ex
# handle syntax errors
rescue Exception => ex
# do what we were doing before
Also, I like what is said in the other posts about limiting the number of attempts; make sure it isn't an infinite loop.
I think the Rails thing to do after a number of attempts would be to log, queue, and/or email.
To log you can use
webpage_logger = Log4r::Logger.new("webpage_helper_logger")
# somewhere later, e.g. on a 404
case e.message
when /404/
webpage_logger.debug "debug level error #{attempts}"
webpage_logger.info "info level error #{attempts}"
webpage_logger.fatal "fatal level error #{attempts}"
end
There are many ways to queue.
I think some of the best are faye and resque. Here is a link to both:
http://faye.jcoglan.com/
https://github.com/defunkt/resque/
Queues work just like a line. Believe it or not the Brits call lines, "queues" (The more you know). So, using a queuing server then you can line up many requests and when the server you are trying to send the request comes back, you can hammer that server with your requests in the queue. Thus forcing their server to go down again, but hopefully over time they will upgrade their machines because they keep crashing.
And finally, to email: Rails also comes to the rescue (not Resque)...
Here is the link to rails guide on ActionMailer: http://guides.rubyonrails.org/action_mailer_basics.html
You could have a mailer like this
class SomeClassMailer < ActionMailer::Base
default :from => "notifications@example.com"
def self.mail(*args)
...
# then later, in the rescue clause (note: `when /404/ && attempts == 3`
# would not do what it looks like, so use a plain `if` instead)
rescue Exception => e
if e.message =~ /404/ && attempts == 3
SomeClassMailer.mail(:to => "broken@example.com", :subject => "Failure! #{attempts}")
end
Instead of using initialize, which always returns a new instance of an object, I'd use a class method to create the instance when creating a new SomeClass from a scraping. I'm not using exceptions here beyond what Nokogiri throws, because it sounds like nothing else should bubble up further: you just want these logged, but otherwise ignored. You mentioned logging the exceptions; are you just logging what goes to stdout? I'll answer as if you are...
# lib/my_helper_module.rb
module MyHelperModule
def self.do_the_process(args = {})
my_models = args[:my_models]
# Parallel.each(my_models, :in_processes => 5) do |my_model|
my_models.each do |my_model|
# Reconnect to prevent errors with Postgres
ActiveRecord::Base.connection.reconnect!
some_object = SomeClass.create_from_scrape(my_model.id)
if some_object
# Do something super interesting if you were able to get a scraping
# otherwise nothing happens (except it is noted in our logging elsewhere)
end
end
end
Your SomeClass:
# lib/some_class.rb
require_relative 'webpage_helper'
class SomeClass
attr_accessor :some_data
def initialize(doc)
@doc = doc
end
# could shorten this, but you get the idea...
def self.create_from_scrape(arg)
doc = WebpageHelper.get_doc("http://somesite.com/#{arg}")
if doc
return SomeClass.new(doc)
else
return nil
end
end
end
Your WebPageHelper:
# lib/webpage_helper.rb
require 'nokogiri'
require 'open-uri'
class WebpageHelper
def self.get_doc(url)
attempts = 0 # define attempts first in non-block local scope before using it
begin
page_content = open(url).read
# do more stuff
rescue Exception => ex
attempts += 1
puts "Failed at #{Time.now}"
puts "Error: #{ex}"
puts "URL: " + url
if attempts < 3
puts "Retrying... Attempt #: #{attempts.to_s}"
sleep(10)
retry
else
return nil
end
end
end
end
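The nil-returning factory pattern in isolation, as a runnable toy (class name made up):

```ruby
# A class-method factory that returns nil on failure instead of raising,
# so callers can use a simple `if` check.
class Scraped
  attr_reader :doc

  def initialize(doc)
    @doc = doc
  end

  def self.create_from(doc)
    doc ? new(doc) : nil
  end
end

Scraped.create_from(nil)       # => nil, no exception
Scraped.create_from("<html>")  # => a Scraped instance
```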

Ruby equivalent to PHPs set_error_handler

I have just barely gotten into Ruby / RoR but need to quickly write a class for handling errors and doing something with them. I've been able to find the important examples/tutorials for the rest of what I need, but I'm having trouble finding the best alternative to PHP's set_error_handler.
My goals are:
I'd like to write a class that will capture any ruby-level errors automatically.
I'd like for the class to also be called by the user when there are custom errors/exceptions to report.
I'd like this to work for any Ruby app, but my main focus is Ruby on Rails applications. Thanks for your advice.
I think the closest equivalent in Rails is rescue_from. It allows you to specify code that will catch any given exception (except some template errors, though there are ways round that). If you want, you could then hand it off to some other class. So I guess what you'd do in your case would be:
in app/controllers/application_controller.rb:
class ApplicationController < ActionController::Base
rescue_from Exception do |e|
MyExceptionHandler.handle_exception(e)
end
end
in lib/my_exception_handler.rb:
class MyExceptionHandler
def self.handle_exception exception
# your code goes here
end
end
If that helps, let me know and I'll dig out the link to how you catch template errors.
begin
# require all_kinds_of_things
"abc".size(1,2)
123.reverse
# rest of brilliant app
rescue Exception => e # Custom, catch-all exception handler
puts "Doh...#{e}"
print "Do you want the backtrace? (Y) :"
puts e.backtrace if gets.chomp == "Y"
end
Define ApplicationController#rescue_action_in_public(exception) and put your custom handling code there.
This augments Rails' default exception handling at the top level - right before the HTTP response is generated. As your Rails apps grow in complexity and use external resources, there will be more exceptions that you'll want to handle much closer to where the exceptions are thrown, but this can get you started.
This method will only work on HTTP requests and will not catch exceptions in any custom rake tasks you create or code executed via rails runner.
Here's an example from one of my applications:
class ApplicationController < ActionController::Base
...
protected
def rescue_action_in_public (exception)
case exception
when ActionController::InvalidAuthenticityToken
if request.xhr?
render :update do |page|
page.redirect_to '/sessions/new/'
end
else
redirect_to '/sessions/new/'
end
when ActionController::NotImplemented
RAILS_DEFAULT_LOGGER.info("ActionController::NotImplemented\n#{request.inspect}")
render :nothing => true, :status => '500 Error'
else
super
end
end
end
