How to know what exceptions to rescue? - ruby-on-rails

I often find myself without knowing what exceptions to rescue when using a specific library of code in Ruby.
For instance, I often use HTTParty for any HTTP requests my rails/sinatra app would make. I dug around the code for HTTParty and found a file containing the defined exceptions used. Great! I'll just rescue them when making a request.
To test it out, I put in a bogus domain name for the request, but instead of the HTTParty::ResponseError exception I expected, I got a SocketError exception.
What is the best way to deal with this? I'm aware that HTTParty is a wrapper around Ruby's built-in HTTP implementation, and that's probably what raised the SocketError. But how would I know that normally?
I could solve this by just rescuing "Exception", but that's pretty awful practice. I'd rather be well aware of the exceptions I could be causing and dealing with those.
EDIT: I should clarify that what really prompted me to create this question was that I have no idea how I can figure out the possible exceptions that CAN be raised when calling a specific function... that is, without looking through every single function call in the stack.

In general terms (I'm not a ruby programmer) I use the following approach.
I deal with exceptions in the following way:
Can I recover from it? If the exception can happen and I know I can recover, or perhaps retry, then I handle the exception.
Does it need to be reported? If the exception can happen but I know I can't recover or retry, then I handle the exception by logging it and then passing it on to the caller. I always do this at a natural subsystem boundary, like a major module or service. Sometimes (dependent on the API) I might wrap the exception in a 'my module'-specific one, so that the caller only has to deal with my exceptions.
Can't handle it? All exceptions that are not dealt with should be caught at the top level, (a) reported, and (b) handled in a way that keeps the system stable and consistent. This one should always be there, regardless of whether the other two are done.
Of course there is another class of exception: the ones so serious that they give you no chance to deal with them. For these there is only one solution, post-mortem debugging, and the best thing for that is logs, logs and more logs. Having worked on many systems, from small to large, I would rather sacrifice performance for stability and recoverability (except where performance is critical) and add copious amounts of logging, introspectively if possible.
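In Ruby terms, that layered approach might be sketched like this. This is only an illustration: FetchError, fetch_with_retry, fetch_page, and the retry count are all made-up names, not part of any library.

```ruby
# Layer 1: recover locally when a retry might fix the problem.
def fetch_with_retry(attempts = 3)
  yield
rescue IOError
  attempts -= 1
  retry if attempts > 0
  raise
end

# Layer 2: at a subsystem boundary, log the failure and re-raise it
# wrapped in a module-specific error so callers see only one type.
class FetchError < StandardError; end

def fetch_page(&block)
  fetch_with_retry(&block)
rescue IOError => e
  warn "fetch failed: #{e.class}: #{e.message}"
  raise FetchError, e.message
end

# Layer 3: one top-level handler catches whatever nobody else dealt
# with, reports it, and keeps the process in a consistent state.
begin
  fetch_page { "page body" }
rescue StandardError => e
  warn "unhandled: #{e.class}: #{e.message}"
end
```

The key point is that each layer has one job: retry, translate-and-log, or report; no layer silently swallows an error it cannot fix.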

A SocketError is entirely reasonable if you put in a bogus domain name.
After all, trying to connect to a non-existent domain makes the connection fail at the DNS lookup, which is exactly what SocketError signals.
The best way to deal with that is to use a valid domain with an invalid path in your test, but catch SocketError in your live code.
The problem here is not that you're catching the wrong exception but that you're priming the test with bad data.
The best course of action is to understand what exceptions could happen and manage them.
By understand, I mean: where does the URL come from? Is it entered by your user? If so, never trust it and catch everything. Does it come from your config data? Semi-trust it and log errors, unless it's mission-critical that the URL is OK.
There's no right or wrong answer here but this approach will, I hope, give you a good result.
Edit: What I'm attempting to do here is advocate the mindset of a programmer who is aware of the results of their actions. We all know that trying to connect to 'thisServiceIsTotallyBogus.somethingbad.notAvalidDomain' will fail; but the mindset of a programmer should be to first establish exactly where that domain comes from. If it is entered by the user, you must assume full checks; if you know it comes from a config file accessed only by yourself or your support team, you can relax a little. Sadly, though, this is a bad example, as you should really always test URLs, because sometimes the internet doesn't work!
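As a sketch of the user-input case (safe_get is a made-up helper name; in live code the block would wrap the real call, e.g. HTTParty.get(user_url)):

```ruby
require "socket"   # defines SocketError
require "timeout"  # defines Timeout::Error

# Hypothetical helper: run an HTTP request and translate the exceptions
# a user-supplied URL can trigger into simple return values, because
# user input can never be trusted.
def safe_get
  yield
rescue SocketError
  :bad_domain   # DNS lookup failed, e.g. a bogus host name
rescue Timeout::Error
  :timed_out    # the host exists but never answered
end

# Live code would read: safe_get { HTTParty.get(user_url) }
```

For a config-sourced URL you might let the same exceptions propagate after logging, since a bad value there is a deployment bug rather than expected input.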
Ideally, the developer documentation for anything you use should tell you what exceptions it can throw.

For libraries or gems where the source code is publicly available, you can typically find the exception types in an exceptions.rb file (ref here). Otherwise you will have to rely on the documentation. If all else fails, you can rescue StandardError, although that is a less-than-ideal practice in many cases (ref this SO answer).
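A sketch of that fallback (the method name is illustrative): rescue StandardError, never bare Exception, so that signals like Interrupt and process-control exceptions like SystemExit remain free to propagate.

```ruby
# Last-resort rescue when a library's exception classes are not
# documented. StandardError is the root of most runtime errors, but it
# deliberately excludes Interrupt, SystemExit, NoMemoryError, etc.
def call_undocumented_library
  yield
rescue StandardError => e
  # Log the concrete class so a narrower rescue can be written later.
  warn "#{e.class}: #{e.message}"
  nil
end
```

Logging the concrete exception class is the important part: after a few runs you will know exactly which classes to rescue, and can replace the broad rescue with narrow ones.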


An RSpec helper like assert_database_unchanged?

Is there an RSpec extension for PostgreSQL that allows one to test something like this?
expect { make_bad_request }.to not_change_database
i.e. To ensure nothing was created, updated or deleted.
Sure, we can check a specific table, but most often we just want to be sure that nothing changed at all, that no multi-stage save sneaked something in.
It's not particularly easy to do with a little helper because, although Postgres has pg_stat_database, it's not updated during the test transaction. I can still see how it's doable, but it would take a bit of plumbing. Has anyone done it?
UPDATE:
I was asked to give an example of how this might be useful.
The convention with HTTP is that if a request returns an error status then no change has been made to the application state. Exceptions to that convention are rare and we like convention over configuration.
Active Record helps enforce this with defaults about how validation works, but it still leaves lots of ways to make mistakes, particularly with complex chains of events where atomicity matters most.
As such, to enforce the HTTP convention with ease, you could take it even further than stated above and instead have something like a directive:
describe 'error responses', :database_changes_disallowed do
  context 'invalid form data' do
    before do
      # ...setup belongs only here
    end

    it 'returns 422' do
      # ...
    end
  end
end
RSpec is already able to use database transactions to isolate state changes at the per-example level; this would aim to subdivide just one step further, between the before and the it.
This will work for a well-designed app, if you have been judicious enough to ensure that your database stores only application state and no pseudo-logging like User#last_active_at. If you haven't, you'll know immediately.
It would greatly increase test coverage against some of the worst kinds of state corruption bugs while needing less code and removing some testing complexity. Cases where a previously passing test suddenly makes a database change would be the result of an architectural change in an unfortunate direction, a real and unusual need for an exception, or a serious bug.
I'll be sad if it turns out to be technically infeasible to implement but it doesn't seem a bad idea in terms of application design.
That's a tricky one, because it's not easy to tell what should not happen in your app. IMO it's better to keep the focus of your specs on what the app should do.
In other words: if you want to test that no DB changes were made, should you check that no files were written? And no requests were made? Should you test that no files permissions have been changed?
I guess you get my point.
But there might be legitimate reasons to do it that I don't know about. In such a case, I'd use something like db-query-matcher:
expect { your_code_here }.not_to make_database_queries(manipulative: true)
I have usually used it, and seen it used, for N+1 tests (when you want to specify how many times a specific query is called), but it seems this matcher would work for you as well.
But it can be very brittle: if you add such checks to most of your tests and your app is evolving, you can end up with failing specs just because some action started to need a DB update. Your call.
I think you are looking for
assert_no_changes(expressions, message = nil, &block)
https://api.rubyonrails.org/v6.0.2.1/classes/ActiveSupport/Testing/Assertions.html#method-i-assert_no_changes
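assert_no_changes evaluates an expression before and after a block and fails if the value differs. Here is a minimal plain-Ruby sketch of the same idea (assert_unchanged is a made-up name), so the mechanics are visible without loading Rails:

```ruby
# Re-evaluate a probe lambda before and after the block; raise if the
# observed value changed. The Rails assertion does essentially this
# with an expression string or lambda.
def assert_unchanged(probe)
  before = probe.call
  result = yield
  after = probe.call
  raise "expected no change, got #{before.inspect} -> #{after.inspect}" unless before == after
  result
end

# In a Rails test the real assertion reads:
#   assert_no_changes -> { User.count } do
#     post users_url, params: { user: invalid_attributes }
#   end
```

Note this only checks the one probed expression, which is exactly the limitation the question is trying to get past; it won't catch a write to an unrelated table.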

When to "let it crash" and when to defend the code in Erlang?

So, with the "let it crash" mantra, Erlang code is meant to be resistant to cruel real-world events like someone unexpectedly pulling the plug, hardware failure, and unstable network connections.
On the other hand, there is defensive programming.
Being new to Erlang, I wonder: how do I know when I want the process to just crash, and when I want it to defend the flow with if, case..of, and type guards?
Say I have an authentication module, which can return a true/false result depending on whether authentication succeeded. Should it cover only the successful scenario and crash if authentication fails due to a wrong login/password?
What about other scenarios, like, if a product is not found in the database, or search results are empty?
I suppose I can't ignore defensive constructs completely, since guards by their nature exist to defend the "normal" flow of the app?
Is there a rule of thumb when to defend and when to crash?
As Fred Hebert says at http://ferd.ca/the-zen-of-erlang.html -
If I know how to handle an error, fine, I can do that for that
specific error. Otherwise, just let it crash!
I'd say that authentication errors, empty search results, etc., are expected outcomes, and they warrant an appropriate response to the user.
I don't think there is actually a Rule of Thumb in this case.
As I see it, whenever you know how to handle an expected error, handle it. In the case of authentication, a failed login isn't really an error; it's normal behavior, so go ahead and write a few lines of code to handle that specific case.
In contrast, a network failure is something that might happen for various reasons; it is not actually expected as part of your code's normal behaviour, so in this case I would go with the "let it crash" philosophy.
Anyway, when going with "let it crash", you of course still need to handle the case where the process crashes (i.e. using links and monitors and restarting the process).
Please also check this very good answer. And you may read more about it here and here.

What's the deal with catching exceptions?

So I've read a lot about catching exceptions. Let's talk about this and iOS together. I've used it with Google Analytics to submit information about the crash and using that to fix bugs.
But this raises a question: can catching these exceptions help prevent apps from crashing? Can you theoretically stop that bit of code from crashing the app and keep the app open? I get that this would probably be impossible if there were no memory left to use, but it would still be nice to know.
Sorry if this sounds like a stupid question; I really should read more about it and do some more research. Any information would be helpful.
I do have a fairly decent knowledge of iOS obj-c for my age and am willing to look into what you have to say.
Thanks!
Exceptions on iOS should never be caught; they are fatal for a reason. Unlike most languages that have a rich exception hierarchy and multiple means of throwing/catching exceptions for the benefit of the program as a whole, Cocoa-Touch code is built around the principle that all exceptions are fatal. It is a mistake to think that you can catch an exception thrown through any frames of Apple-provided code and have your process continue unhindered. It is an even more grave mistake to catch and rethrow the exception for the purpose of logging.
The exceptions thrown by Cocoa-Touch indicate serious errors in program logic, or undefined and unresolvable state in an object. It is not OK to ignore them, or log them after catching them. They must be fixed and prevented from being thrown in the first place in order to truly guarantee your process remains stable.

Rhino eTL: Join operation with orphan rows

I'm using Rhino ETL for the first time in a project, and I'm very impressed by its capabilities. I use a join operation to match two data sources.
Sometimes there might be missing data, so I override LeftOrphanRow to "log" the error. I thought I would throw an exception and then, at the end of the process, collect all the exceptions that occurred using GetAllErrors().
But it seems the process is aborted at the first exception. Is that intentional? What would be the best way to deal with OrphanRows, especially when I would like a summary of all orphan rows for all operations at the end of the process?
Seems to me that the problem is that you're trying to use exceptions to report a non-exceptional event. That's not really what exceptions are for, and certainly when you're expecting the exception to pass through a third-party library, you shouldn't rely on that library to behave in any specific way with respect to that exception.
Can you just keep a list of orphan rows somewhere, e.g. globally, and add to it whenever you encounter one in any of your join operations? Then, after your EtlProcess is finished, just print the list out. You might also consider using log4net to accomplish this, or even simply raising an event that you subscribe to elsewhere and handle however seems appropriate.

What information should I be logging in my web app?

I'm finishing up a web application and I'm trying to implement some logging. I've never seen any good examples of what to log. Is it just exceptions? Are there other things I should be logging? What type of information do you find useful for finding and fixing bugs?
Looking for some specific guidance and best practices.
Thanks
Follow up
If I'm logging exceptions what information specifically should I be logging? Should I be doing something more than _log.Error(ex.Message, ex); ?
Here is my logical breakdown of what can be logged within an application, why you might want to, and how you might go about doing it. No matter what, I would recommend using a logging framework such as log4net when implementing.
Exception Logging
When everything else has failed, this should not. It is a good idea to have a central means of capturing all unhandled exceptions. This shouldn't be much harder than wrapping your entire application in a giant try/catch, unless you are using more than one thread. The work doesn't end there, though, because if you wait until the exception reaches you, a lot of useful information will have gone out of scope. At the very least you should try to collect specific pieces of application state that are likely to help with debugging as the stack unwinds. Your application should always be prepared to produce this type of log output, especially in production. Make sure to take a look at ELMAH if you haven't already; I haven't tried it, but I have heard great things.
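A hedged Ruby sketch of that central capture point (with_crash_logging and the injectable log lambda are illustrative names): grab the class, message, and backtrace while they are still in scope, then re-raise rather than swallow.

```ruby
# Last-resort wrapper around an app's entry point: record everything a
# post-mortem needs, then let the exception continue so the process
# does not limp along in an inconsistent state.
def with_crash_logging(log: ->(line) { warn line })
  yield
rescue StandardError => e
  log.call "FATAL #{e.class}: #{e.message}"
  Array(e.backtrace).first(5).each { |frame| log.call "  #{frame}" }
  raise  # report, don't swallow
end
```

Injecting the logger as a lambda keeps the wrapper testable and lets production swap in a real logging framework without touching the handler.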
Application Logging
What I call application logs includes any log that captures information about what your application is doing on a conceptual level, such as "Deleted Order" or "A User Signed On". This kind of information can be useful for analyzing trends, auditing the system, locking it down, testing, security, and detecting bugs, of course. It is probably a good idea to plan on leaving these logs on in production as well, perhaps at variable levels of granularity.
Trace Logging
Trace logging, to me, represents the most granular form of logging. At this level you focus less on what the application is doing and more on how it is doing it. This is one step above actually walking through the code line by line. It is probably most helpful in dealing with concurrency issues or anything for that matter which is hard to reproduce. You wouldn't want to always have this running, probably only turning it on when needed.
Lastly, as with so many other things that usually only get addressed at the very end, the best time to think about logging is at the beginning of a project so that the application can be designed with it in mind. Great question though!
Some things to log:
business actions, such as adding/deleting items. Talk to your app's business owner to come up with a list of things that are useful. These should make sense to the business, not just to you (for example: when a user submits a report, when a user creates a new process, etc.)
exceptions
exceptions
exceptions
Some things to NOT to log:
do not log information simply for tracking user usage. Use an analytics tool for that (which tracks usage in JavaScript, on the client)
do not track passwords or hashes of passwords (huge security issue)
Maybe you should log page/resource accesses which are not yet defined in your application, but are requested by clients. That way, you may be able to find vulnerabilities.
It depends on the application and its audience. If you are managing sales or trading stocks, you should probably log more info than, say, a personal blog. You need the log most when an error is happening in your production environment but can't be reproduced locally. Having log levels and a log hierarchy helps in such situations, because you can dynamically increase the log level. See log4j's documentation and log4net.
My few cents.
Besides using log severity and exceptions properly, consider structuring your log statements so that you can easily look through the log data in the future, for example extracting meaningful info quickly or running queries. There is no problem generating an ocean of log data; the problem is converting that data into information. Structuring and defining it beforehand helps with later usage. If you use log4j, I would also suggest using the mapped diagnostic context (MDC); this helps a lot with tracking session contexts. Aside from trace and info, I would also use the debug level, where I usually keep temporary items; those can be filtered out or disabled when not needed.
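A small sketch of that structuring idea in Ruby (field names like event and severity are a convention I'm assuming, not a standard): emitting one JSON object per line makes the log queryable instead of merely greppable.

```ruby
require "json"
require "time"

# Emit one self-describing JSON object per log line, with a fixed core
# (time, severity, event) plus arbitrary structured fields.
def log_line(severity, event, **fields)
  JSON.generate(
    { time: Time.now.utc.iso8601, severity: severity, event: event }.merge(fields)
  )
end

puts log_line(:info, "order.deleted", order_id: 42, user_id: 7)
```

Because every line carries named fields, later questions like "all events for user 7 yesterday" become a filter rather than a regex-guessing exercise.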
You probably shouldn't be thinking of this only at this stage; rather, logging is helpful to consider at every stage of development, to help defuse potential bugs before they arise. Depending on your program, I would try to capture as much information as possible. Log everything. You can always stop logging certain components or processes if you don't reference that data enough. There is no such thing as too much information.
From my (limited) experience, if you don't want to make a specific error table for each possible error type, construct a generic database table that accepts general information as well as a string field you can populate with exception data, confirmation messages from successful yet important processes, etc. I've used a generic function with parameters for this.
You should also consider the ability to turn logging off if necessary.
Hope this helps.
I believe that when you log an exception you should also save the current date and time, the requested URL, the URL referrer, and the user's IP address.
