Rails formatting logs to use with aws-logs and CloudWatch - ruby-on-rails

AWS has a very cool log collection tool, awslogs.
However, I do not understand how to format my logs or configure the tool so that lines belonging to the same error message are grouped together. Right now AWS shows one message per line (because every line is timestamped).
My current log configuration does indeed capture one new log entry per line. How can I get around this?
[rails/production.log]
file = /var/www/xxx/shared/log/production.log
log_group_name = /rails/production.log
log_stream_name = {instance_id}
time_zone = LOCAL
datetime_format = %Y-%m-%dT%H:%M:%S

I actually partly solved the problem using lograge with JSON output, which Amazon parses correctly and which lets you group most requests correctly.
However, I still have some problems with errors, which are not output the same way and still generate one entry per line of the stack trace in awslogs.
EDIT: We are now using a Rails API, and regular exceptions thrown during JSON requests are rescued and rendered by a json:api error handler. Furthermore, we use Rollbar to log actual errors, so having the full error log has become irrelevant.
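One way to keep a stack trace from being split across entries is to serialise the whole exception into a single JSON line, the same shape lograge already gives request logs. The following is a minimal sketch in plain Ruby; the helper name exception_to_json_line, the field names, and the 20-frame cap are illustrative assumptions, not part of lograge or awslogs.

```ruby
require 'json'
require 'time'

# Hypothetical helper: serialise an exception, backtrace included, into one JSON line.
# Field names and the 20-frame cap are illustrative choices.
def exception_to_json_line(error, time: Time.now.utc)
  JSON.generate(
    timestamp: time.iso8601,
    level: 'ERROR',
    error_class: error.class.name,
    message: error.message,
    backtrace: Array(error.backtrace).first(20) # backtrace is nil if never raised
  )
end

begin
  raise ArgumentError, 'bad input'
rescue ArgumentError => e
  # Prints a single line, however deep the stack trace is
  puts exception_to_json_line(e)
end
```

Because JSON escapes embedded newlines, a line-oriented collector sees exactly one entry per exception.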
In our API::ApplicationController
# We don't want error reports for those errors
RESCUABLE_ERRORS = [
  ActionController::ParameterMissing,
  ActiveModel::ForbiddenAttributesError,
  StrongerParameters::InvalidParameter,
  Mongoid::Errors::Validations,
  Mongoid::Errors::DocumentNotFound
]
# Note: in tests we do not want to rescue non-RuntimeError exceptions straight away,
# because they most likely indicate a real bug that should be fixed. In production we
# rescue any error so the frontend gets a JSON:API error instead of the default HTML response.
rescue_from(Rails.env.test? ? RuntimeError : Exception) do |e|
  handle_exception(e)
  notify_exception(e, 'Rescued from API controller - Rendering JSONAPI Error')
end
rescue_from(*RESCUABLE_ERRORS) do |e|
  handle_exception(e)
end
In the controllers that inherit from API::ApplicationController, we add further rescue_from blocks as needed, depending on whether we want to report the exception as an error (notify_exception) or just convert it to a JSON payload (handle_exception):
rescue_from(SPECIFIC_ERROR_CLASS) do |exception|
  handle_exception(exception) # will render a json:api error payload
  # notify_exception(exception) # Optional: ExceptionNotifier broadcasts the error to email/Rollbar, etc. if this error should not happen.
end
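The source does not show handle_exception itself. As a rough sketch of what such a helper might build, here is a plain-Ruby function that turns an exception into a JSON:API-style error document. The name jsonapi_error_payload and the choice of fields are assumptions for illustration; a real controller would pass the result to render.

```ruby
require 'json'

# Hypothetical helper (not the author's handle_exception): builds a JSON:API-style
# error document from an exception.
def jsonapi_error_payload(exception, status: 500)
  {
    errors: [
      {
        status: status.to_s,           # JSON:API expresses the HTTP status as a string
        title: exception.class.name,
        detail: exception.message
      }
    ]
  }
end

payload = jsonapi_error_payload(ArgumentError.new('name is missing'), status: 400)
puts JSON.generate(payload)
# In a Rails controller this could become (assumption):
#   render json: payload, status: 400
```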

Related

How to mark a sidekiq task/job for retry without raising an error?

I use a Sidekiq queue to process communications with an unreliable, 3rd party API. Since this API is often down for a couple minutes at a time and then back up again, Sidekiq has been handy. When a connection issue happens, an error is raised and Sidekiq throws the job back in the queue to be retried again later, after some time has passed.
I use NewRelic to not only help debug crashes, but also for monitoring. My problem is that this current methodology above creates errors in NewRelic. If the 3rd party API is down for more than a couple of minutes, the error count accumulates enough to cause notifications to send out through NewRelic.
What I'd like to do is only raise an error from my worker when a certain number of retries have occurred for a job. I'm using sidekiq_retries_exhausted to do this. My problem is that I'm not quite sure how to put jobs back in the queue after they have an error without raising an error.
Does Sidekiq provide any facilities to return a job to a queue, increment the number of retries for the job, and have it sit there until it's due to run again, as if an exception was raised in the worker class?
You raise a specific error and tell the error service to ignore errors of that type. For NewRelic:
https://docs.newrelic.com/docs/agents/ruby-agent/installation-configuration/ruby-agent-configuration#error_collector.ignore_errors
Here is what I did to keep intentional retry errors out of AirBrake:
class TaskWorker
  include Sidekiq::Worker
  class RetryNotAnError < RuntimeError
  end
  def perform task_id
    task = Task.find(task_id)
    task.do_cool_stuff
    if task.finished?
      #log.debug "Task #{task_id} was successful."
      return false
    else
      #log.debug "Task #{task_id} will try again later."
      raise RetryNotAnError, task_id
    end
  end
end
Tell Airbrake to ignore it:
Airbrake.configure do |config|
  config.ignore << 'RetryNotAnError'
end
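One detail to check when ignoring by name: an exception class defined inside the worker carries the worker's namespace in its name. Assuming the error service compares against the class name string (how a given Airbrake version matches is not confirmed here), the ignore entry may need the fully qualified name. The sketch below uses a hypothetical in-memory ignore list to show the difference.

```ruby
class TaskWorker
  class RetryNotAnError < RuntimeError; end
end

# Hypothetical ignore list standing in for an error service's configuration
IGNORED_ERROR_NAMES = ['TaskWorker::RetryNotAnError'].freeze

# Returns true when the error's fully qualified class name is on the ignore list
def ignored?(error)
  IGNORED_ERROR_NAMES.include?(error.class.name)
end

error = TaskWorker::RetryNotAnError.new('retry later')
puts error.class.name # the name includes the enclosing TaskWorker namespace
puts ignored?(error)
```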
It's good to make your exception name OBVIOUSLY not an error (e.g. RetryLaterNotAnError), as it will still show up in logs and such, and you don't want to freak people out when they see a bunch of them.
P.S. That said, I would really like to see Sidekiq provide an explicit, errorless retry mechanism.
If using Sidekiq Enterprise, one other option might be to utilize the optional set of additional error types that will then get treated as Sidekiq::Limiter::OverLimit violations.
For my purposes, I've used a new error class and then added it to the list in the config. Here are the notes from the sidekiq-ent code (not in the public sidekiq repo) on how to modify your config file:
# An optional set of additional error types which would be
# treated as a rate limit violation, so the job would automatically
# be rescheduled as with Sidekiq::Limiter::OverLimit.
#
# Sidekiq::Limiter.errors << MyApp::TooMuch
# Sidekiq::Limiter.errors = [Foo::Error, MyApp::Limited]
Inside the specific job you can specify the max_retries, or it will default to 20:
sidekiq_options max_limiter_retries: 10
Inside the job, I'll rescue the "expected" intermittent error that I'd rather not ignore completely and then raise the error I've added to the list, something like this:
rescue RestClient::RequestTimeout => e
raise SidekiqSoftRetry.new(e.inspect)
end
Here's what that looks like in my initialization file. Mike Perham was also kind enough to respond with the option to update the global retry limit.
class SidekiqSoftRetry < RuntimeError
end
Sidekiq::Limiter::DEFAULT_OPTIONS[:reschedule] = 10
Sidekiq::Limiter.configure do |config|
  config.errors.concat(
    [
      SidekiqSoftRetry,
    ]
  )
end

Raise exception if RSpec fails

I'd like to write a rake script that runs all the RSpec tests in my application. If any of the tests fail, I'd like to raise an exception in the task (later on I will catch this exception in the NewRelic alert system, which I use for other tasks as well).
Is it possible?
First, you don't need to raise an exception to let NewRelic know about it. You can report a handled error directly through the agent API: https://docs.newrelic.com/docs/agents/ruby-agent/troubleshooting/sending-new-relic-handled-errors
NewRelic::Agent.notice_error(exception, options = { })
where exception can be an exception object (StandardError.new, for example) or a message.
Alternatively, you can skip the exception business altogether and check the exit code of the rspec command-line tool. If the tests are green, it will be zero; if there are failures, it will be non-zero. Something like this:
if system('rspec spec') # returns true if the command was successful, false otherwise
  # if green
else
  # if red
end
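If an exception is still wanted, for example so that an alerting hook picks it up, the exit status can be turned into one. A minimal sketch, assuming a hypothetical SpecsFailed error class and run_or_raise helper (neither comes from RSpec or NewRelic):

```ruby
require 'rbconfig'

# Hypothetical error class raised when the wrapped command fails
class SpecsFailed < StandardError; end

# Runs a command and raises SpecsFailed unless it exits with status 0.
def run_or_raise(*command)
  # system returns true on exit status 0, false on a non-zero status,
  # and nil if the command could not be executed at all
  return true if system(*command)

  raise SpecsFailed, "#{command.join(' ')} failed (#{$?.inspect})"
end

# Demonstrated with a trivial Ruby subprocess so the example runs anywhere.
# In a rake task the call might be: run_or_raise('rspec', 'spec')
run_or_raise(RbConfig.ruby, '-e', 'exit 0')
```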

How to retry a rake task if you get a Bad Gateway error response from a web source

I am trying to run a rake task to get all the data with a specific tag from Instagram, and then input some of the data into my server.
The task runs just fine, except sometimes I'll get an error response. It's sort of random, so I think it just happens sometimes, and since it's a fairly long running task, it'll happen eventually.
This is the error on my console:
Instagram::BadGateway: GET https://api.instagram.com/v1/tags/xxx/media/recent.json?access_token=xxxxx&max_id=996890856542960826: 502: The server returned an invalid or incomplete response.
When this happens, I don't know what else to do except run the task again starting from that max_id. However, it would be nice if I could get the whole thing to automate itself, and retry itself from that point when it gets that error.
My task looks something like this:
task :download => :environment do
  igs = Instagram.tag_recent_media("xxx")
  begin
    sleep 0.2
    igs.each do |ig|
      dl = Instadownload.new
      dl.instagram_url = ig.link
      dl.image_url = ig.images.standard_resolution.url
      dl.caption = ig.caption.text if ig.caption
      dl.taken_at = Time.at(ig.created_time.to_i)
      dl.save!
    end
    if igs.pagination.next_max_id?
      igs = Instagram.tag_recent_media("xxx", max_id: igs.pagination.next_max_id)
      moreigs = true
    else
      moreigs = false
    end
  end while moreigs
end
Chad Pytel and Tammer Saleh call this the "Fire and Forget" antipattern in their Rails AntiPatterns book:
Assuming that the request always succeeds or simply not caring if it
fails may be valid in rare circumstances, but in most cases it's
insufficient. On the other hand, rescuing all exceptions would be
bad practice as well. The proper solution is to understand the
actual exceptions that will be raised by the external service and rescue
those only.
So, what you should do is wrap your code in a begin/rescue block that rescues the appropriate set of errors raised by Instagram (list of those errors can be found here). I'm not sure which particular line of your code snippet fails with the 502 code, so this is just to give you an idea of what it could look like:
begin
  dl = Instadownload.new
  dl.instagram_url = ig.link
  dl.image_url = ig.images.standard_resolution.url
  dl.caption = ig.caption.text if ig.caption
  dl.taken_at = Time.at(ig.created_time.to_i)
  dl.save!
rescue Instagram::BadGateway => e # list of acceptable errors can be expanded
  retry # restart from beginning
end
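A bare retry loops forever if the service stays down, and it only helps if the begin block wraps the call that actually raises (most likely the Instagram.tag_recent_media request rather than the save). A minimal sketch of a bounded retry with backoff follows. The names with_retries and TransientError are hypothetical, with TransientError standing in for Instagram::BadGateway so the example runs without the gem.

```ruby
# Stand-in for Instagram::BadGateway (assumption, so the sketch needs no gem)
class TransientError < StandardError; end

# Hypothetical helper: runs the block, retrying on the listed errors with
# exponential backoff, and re-raises once max_attempts is reached.
def with_retries(max_attempts: 3, base_delay: 0.5, on: [TransientError])
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue *on
    raise if attempts >= max_attempts # give up and surface the last error

    sleep(base_delay * (2**(attempts - 1))) # delay doubles after each failure
    retry
  end
end

# Fails twice, then succeeds on the third attempt
result = with_retries(base_delay: 0.01) do |attempt|
  raise TransientError, '502 Bad Gateway' if attempt < 3

  "fetched on attempt #{attempt}"
end
puts result
```

In the rake task, the paginated fetch could be wrapped in such a helper with on: [Instagram::BadGateway], so a failed page is requested again with the same max_id instead of restarting the whole download.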

Check if Nokogiri HTML document is usable

I want to check if the URL that the user inputs is in fact a valid page.
I tried:
if Nokogiri::HTML(open("http://example.com"))
#DO REQUIRED TASK
end
But that immediately throws an error upon attempting to open the page. I want to return the result of whether it is a document of any kind.
I either get the error:
no such file or directory
or:
getaddrinfo: Name or service not known
depending on how I try to make the check.
I'd start with something like:
require 'nokogiri'
require 'open-uri'
begin
  doc = Nokogiri.HTML(open(url))
rescue Exception => e
  puts "Couldn't read \"#{ url }\": #{ e }"
  exit
end
puts (doc.errors.empty?) ? "No problems found" : doc.errors
Nokogiri sets the document's errors array to the values of any errors that occurred during the parsing process.
This only addresses one part of the issue though. Malicious people like to break things, and this would be very easy to break. In general, be very careful about anything a user gives you, especially if your site is exposed to the wild internet.
Before telling OpenURI to load the file and hand it to Nokogiri, you should sniff the URL and do some sanity checks, using an HTTP HEAD request to find out the size and MIME type of the content being retrieved. Once you know those, you can try loading the file.
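As a sketch of what those checks might look like with only the standard library: the first function rejects anything that is not an http or https URL with a host, and the second issues a HEAD request and reports the status, content type, and length. The function names http_url? and head_info and the 5-second timeouts are assumptions for illustration, not an established API.

```ruby
require 'net/http'
require 'uri'

# Hypothetical check: true only for http/https URLs that include a host.
def http_url?(string)
  uri = URI.parse(string)
  uri.is_a?(URI::HTTP) && !uri.host.to_s.empty? # URI::HTTPS is a subclass of URI::HTTP
rescue URI::InvalidURIError
  false
end

# Hypothetical helper: sends a HEAD request and returns basic metadata
# without downloading the body. Timeouts are arbitrary illustrative values.
def head_info(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == 'https',
                             open_timeout: 5, read_timeout: 5) do |http|
    http.head(uri.request_uri)
  end
  {
    status: response.code.to_i,
    content_type: response['content-type'],
    content_length: response['content-length']&.to_i
  }
end

# Only the offline check is exercised here; head_info needs network access.
puts http_url?('http://example.com')
puts http_url?('/etc/passwd')
```

A caller could refuse to parse anything whose content type is not HTML or whose content length exceeds a chosen limit before handing the URL to OpenURI and Nokogiri.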
Firstly, it's bad style to 'rescue Exception => e' in Ruby.
[Refer: http://daniel.fone.net.nz/blog/2013/05/28/why-you-should-never-rescue-exception-in-ruby/ ]
Secondly, for this case, "rescue OpenURI::HTTPError => e" would be more suitable.
I'm not familiar with handling exceptions, but something like:
begin
  page = Nokogiri::HTML(open("http://example.com"))
rescue OpenURI::HTTPError, SocketError, Errno::ENOENT
  puts "not a document of any kind"
end
do_something_with(page) if page
...should do the trick.
Or (after reading your comment):
begin
  page = open("http://example.com")
rescue OpenURI::HTTPError, SocketError, Errno::ENOENT
  puts "not a document of any kind"
end
Nokogiri::HTML(page) if page

Does Ruby's 'open_uri' reliably close sockets after read or on fail?

I have been using open_uri to pull down an FTP path as a data source for some time, but suddenly found that I'm getting a nearly continual "530 Sorry, the maximum number of allowed clients (95) are already connected." error.
I am not sure whether my code is faulty or someone else is accessing the server, and unfortunately there seems to be no way for me to know for sure who's at fault.
Essentially I am reading FTP URIs with:
def self.read_uri(uri)
  begin
    uri = open(uri).read
    uri == "Error" ? nil : uri
  rescue OpenURI::HTTPError
    nil
  end
end
I'm guessing that I need to add some additional error handling code in here...
I want to be sure that I take every precaution to close all connections so that my connections are not the problem. However, I thought that open_uri + read would take care of this, compared with using the Net::FTP methods.
The bottom line is that I've got to be 100% sure that these connections are being closed and that I don't somehow have a bunch of open connections lying around.
Can someone please advise on how to correctly use read_uri to pull in FTP with a guarantee that it's closing the connection? Or should I shift the logic over to Net::FTP, which could give me more control over the situation if open_uri is not robust enough?
If I do need to use the Net::FTP methods instead, is there a read method that I should be familiar with, as opposed to pulling the file down to a tmp location and then reading it (I'd much prefer to keep it in a buffer rather than on the filesystem if possible)?
I suspect you are not closing the handles. OpenURI's docs start with this comment:
It is possible to open http/https/ftp URL as usual like opening a file:
open("http://www.ruby-lang.org/") {|f|
  f.each_line {|line| p line}
}
I looked at the source and the open_uri method does close the stream if you pass a block, so, tweaking the above example to fit your code:
uri = ''
open("http://www.ruby-lang.org/") {|f|
  uri = f.read
}
Should get you close to what you want.
Here's one way to handle exceptions:
# The list of URLs to pass in to check if one times out or is refused.
urls = %w[
  http://www.ruby-lang.org/
  http://www2.ruby-lang.org/
]
# the method
def self.read_uri(urls)
  content = ''
  open(urls.shift) { |f| content = f.read }
  content == "Error" ? nil : content
rescue OpenURI::HTTPError
  retry if (urls.any?)
  nil
end
Try using a block:
data = open(uri){|f| f.read}

Resources