Rails ping website and evaluate Net::HTTP response - ruby-on-rails

I'm making a simple way to check if a site is up or not. This is my Ping model, which holds a few addresses I want to check:
require 'net/http'
def self.check
  pings = Ping.all
  pings.each do |p|
    http = Net::HTTP.new(p.address,80)
    response = http.request_get('/')
    if response.message == ( 'OK' or 'Found')
      puts 'up!!'
    end
  end
end
I'm just checking if the response message is "OK" or "Found" but my or statement only checks for "OK".
Also is this a good way to check?

response.message == ( 'OK' or 'Found') isn't going to do what you think.
> ( 'OK' or 'Found')
=> "OK"
This is why it's only checking for "OK". IMHO you shouldn't check the message as that could vary. Check the response code for anything in the 200 or 300 range.
As long as you aren't checking a lot of sites the above is fine. If you were, it might begin to take a while, as the requests are sequential.
You might also want to add a timeout so that if a ping attempt hangs for, say, 20 seconds, the site is considered down.
Also, I know for a fact that some websites will return 400 errors for unrecognized user agents. So what you see via your browser might not be at all what your script is going to get.
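The suggestions above (check the status code range, add a timeout, send a browser-like user agent) can be sketched roughly as follows. This is only a minimal sketch: the method names success_code? and site_up?, the 20-second default, and the User-Agent string are assumptions made for illustration, not part of the original code.

```ruby
require 'net/http'

# Treat any 2xx or 3xx status as "up".
# Net::HTTP returns the status code as a String, hence the to_i.
def success_code?(code)
  (200...400).cover?(code.to_i)
end

# Hypothetical helper: true when the host answers with a 2xx/3xx status
# within the timeout. The timeout and user agent values are assumptions.
def site_up?(host, timeout: 20, user_agent: 'Mozilla/5.0 (compatible; PingCheck)')
  http = Net::HTTP.new(host, 80)
  http.open_timeout = timeout   # seconds allowed for opening the connection
  http.read_timeout = timeout   # seconds allowed for each read
  response = http.request_get('/', 'User-Agent' => user_agent)
  success_code?(response.code)
rescue StandardError
  # Timeouts, DNS failures and refused connections all count as "down".
  false
end
```

In the model this could be used as something like `puts 'up!!' if site_up?(p.address)` inside the loop.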

As pointed out by @PhilipHallstrom, your if statement isn't doing what you think it's doing: if a == (b or c) is not the same as if a == b || a == c. Also, there is the issue of case-sensitive equality.
Try a Regexp with the case insensitive configuration (the i at the end):
response.message.match /^(ok|found)$/i
... although...
I support @PhilipHallstrom's reservations about the code.
Numerical codes would probably be better and application responsiveness should be a factor in the design (I assume you have that one covered)...
I would probably go for:
response.code.to_i >= 200 && response.code.to_i < 400 # Net::HTTP returns the code as a String

Related

Stripe API auto_paging get all Stripe::BalanceTransaction except some charge

I'm trying to get all Stripe::BalanceTransaction records except those that are already in my JsonStripeEvent table.
What I did =>
def perform(*args)
  last_recorded_txt = REDIS.get('last_recorded_stripe_txn_last')
  txns = Stripe::BalanceTransaction.all(limit: 100, expand: ['data.source', 'data.source.application_fee'], ending_before: last_recorded_txt)
  REDIS.set('last_recorded_stripe_txn_last', txns.data[0].id) unless txns.data.empty?
  txns.auto_paging_each do |txn|
    if txn.type.eql?('charge') || txn.type.eql?('payment')
      begin
        JsonStripeEvent.create(data: txn.to_json)
      rescue StandardError => e
        Rails.logger.error "Error while saving data from stripe #{e}"
        REDIS.set('last_recorded_stripe_txn_last', txn.id)
        break
      end
    end
  end
end
But it doesn't get the new ones from the API.
Could anyone help me with this? :)
Thanks
I think it's because the way auto_paging_each works is almost opposite to what you expect :)
As you can see from its source, auto_paging_each calls Stripe::ListObject#next_page, which is implemented as follows:
def next_page(params={}, opts={})
  return self.class.empty_list(opts) if !has_more
  last_id = data.last.id
  params = filters.merge({
    :starting_after => last_id,
  }).merge(params)
  list(params, opts)
end
It simply takes the last (already fetched) item and adds its id as the starting_after filter.
So what happens:
1. You fetch the 100 "latest" (let's say) records, ordered by descending date (the default order for the BalanceTransaction API according to the Stripe docs).
2. When you then call auto_paging_each on this dataset, it takes the last record, adds its id as the starting_after filter and repeats the query.
3. The repeated query returns nothing because there is nothing newer (starting later) than the set you initially fetched.
4. Since there are no newer items available, the iteration stops after the first step.
What you could do here:
First of all, ensure that my hypothesis is correct :) - put the breakpoint(s) inside Stripe::ListObject and check. Then 1) rewrite your code to use starting_after traversing logic instead of ending_before - it should work fine with auto_paging_each then - or 2) rewrite your code to control the fetching order manually.
Personally, I'd vote for (2): for me slightly more verbose (probably), but straightforward and "visible" control flow is better than poorly documented magic.
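Option (2) could look roughly like the loop below. It is a sketch only, and it assumes that a request with ending_before returns the page of items listed immediately before the cursor (that is, the next-newer ones), newest first. The method fetch_newer, the Txn struct and the in-memory FAKE_API are all hypothetical stand-ins so the control flow can be run without the Stripe gem; in real code the callable would wrap the actual list request.

```ruby
# Hypothetical helper: collects every item newer than last_seen_id by
# moving the ending_before cursor forward one page at a time.
# fetch_page is any callable taking (ending_before:, limit:) and returning
# an array of items (newest first) that respond to #id.
def fetch_newer(fetch_page, last_seen_id, limit: 100)
  collected = []
  cursor = last_seen_id
  loop do
    page = fetch_page.call(ending_before: cursor, limit: limit)
    break if page.empty?          # nothing newer than the cursor is left
    collected.concat(page)
    cursor = page.first.id        # newest item on this page is the next cursor
  end
  collected
end

# In-memory stand-in for the remote list, newest first (assumed data).
Txn = Struct.new(:id)
ALL_TXNS = %w[txn_5 txn_4 txn_3 txn_2 txn_1].map { |id| Txn.new(id) }

FAKE_API = lambda do |ending_before:, limit:|
  index = ALL_TXNS.index { |t| t.id == ending_before } || ALL_TXNS.size
  ALL_TXNS[0...index].last(limit) # the page adjacent to the cursor
end

newer_ids = fetch_newer(FAKE_API, 'txn_2', limit: 2).map(&:id)
```

With this shape the "last recorded" id only needs to be stored once, after the loop finishes, instead of being juggled inside the iteration.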

How to prevent rollbar from reporting SEO crawlers activities?

I have set up Rollbar in my Rails application. It keeps reporting RecordNotFound errors, which are the result of SEO crawlers (i.e. Google bot, Baidu, findxbot etc.) requesting deleted posts.
How can I prevent Rollbar from reporting SEO crawler activity?
TL;DR:
# ./initializers/rollbar.rb
#
# https://stackoverflow.com/questions/36588449/how-to-prevent-rollbar-from-reporting-seo-crawlers-activities
#
# frozen_string_literal: true
crawlers = %w[Facebot Twitterbot YandexBot bingbot AhrefsBot crawler MJ12bot Yahoo GoogleBot Mail.RU_Bot SemrushBot YandexMobileBot DotBot AppleMail SeznamBot Baiduspider]
regexp = Regexp.new(Regexp.union(*crawlers).source, Regexp::IGNORECASE)
Rollbar.configure do |config|
  ignore_bots = lambda do |options|
    agent = options.fetch(:scope).fetch(:request).call.fetch(:headers)['User-Agent']
    raise Rollbar::Ignore if agent.match?(regexp)
  end
  config.before_process << ignore_bots
  ...
end
======================
Be careful with the frozen_string_literal magic comment (it needs Ruby 2.3 or later), and use =~ instead of match? on Ruby versions older than 2.4, where match? does not exist.
Here I use an array that is transformed into a regexp. I did this to prevent syntax and escaping errors by developers in the future, and I added the ignore-case flag for the same reason.
So in the regexp you will see Mail\.RU_Bot, correctly escaped, instead of something wrong.
Also, in your case you could simply use the word bot instead of listing many crawlers, but be careful with unusual user agents. In my case, I want to know about all crawlers on my site, so I came up with this solution. Another example of how it works: there are both crawler and crawler4j on my production site. I use just crawler in the array to suppress notifications for both of them.
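To make the escaping and partial-match behaviour concrete, here is a small standalone sketch of the same array-to-regexp idea. The constant names, the shortened crawler list and the crawler? helper are assumptions made for illustration; only Regexp.union, Regexp.new and Regexp::IGNORECASE come from the snippet above.

```ruby
# Names are matched as literal text: Regexp.union escapes special
# characters, so the dot in Mail.RU_Bot cannot match arbitrary characters.
CRAWLER_NAMES = %w[GoogleBot bingbot Baiduspider Mail.RU_Bot crawler].freeze
CRAWLER_REGEXP = Regexp.new(Regexp.union(*CRAWLER_NAMES).source, Regexp::IGNORECASE)

# Hypothetical helper: true when the User-Agent header looks like a
# listed crawler. to_s guards against a missing (nil) header.
def crawler?(user_agent)
  CRAWLER_REGEXP.match?(user_agent.to_s)
end
```

Because the pattern is unanchored, the single entry crawler also covers crawler4j, and the ignore-case flag lets GoogleBot match Googlebot.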
One last thing: my solution is not very optimal, but it works. I hope someone will share an optimized version of my code. That's also the main reason I recommend sending data asynchronously, i.e. using sidekiq, delayed_job or whatever you want; don't forget to check the related wikis.
My answer is based on @AndrewSouthpaw's solution (?), which wasn't working for me. I hope the accepted answer by @Jesse Gibbs, which is copy-pasted from the wiki, gets moderated in some way.
=======
EDIT1: it's a good idea to check the https://github.com/ZLevine/rollbar-ignore-crawler-errors repo if you need to stop Rollbar from notifying on JS errors.
Looks like you are using rollbar-gem, so you'd want to use Rollbar::Ignore to tell Rollbar to ignore errors that were caused by a spider
handler = proc do |options|
  raise Rollbar::Ignore if is_crawler_error(options)
end
Rollbar.configure do |config|
  config.before_process << handler
end
where is_crawler_error detects if the request that led to the error was from a crawler.
If you are using rollbar.js to detect errors in client-side Javascript, then you can use the checkIgnore option to filter out client-side errors caused by bots:
_rollbarConfig = {
  // current config...
  checkIgnore: function(isUncaught, args, payload) {
    if (window.navigator.userAgent && window.navigator.userAgent.indexOf('Baiduspider') !== -1) {
      // ignore baidu spider
      return true;
    }
    // no other ignores
    return false;
  }
}
Here's what I did:
# A lambda is used so the block simply yields its last value; an explicit
# return inside Proc.new would try to return from the enclosing scope.
is_crawler_error = lambda do |options|
  request = options[:scope][:request]
  request['From'] == 'bingbot(at)microsoft.com' ||
    request['From'] == 'googlebot(at)googlebot.com' ||
    request['User-Agent'] =~ /Facebot|Twitterbot/ # either bot, not the two names in sequence
end
handler = proc do |options|
  raise Rollbar::Ignore if is_crawler_error.call(options)
end
config.before_process << handler
Based on these docs.

Rails 4 ActiveModel won't update_columns when tested with RSpec

In a normal manual test in the browser, everything works as expected. However, when I use RSpec, I can see the following in the log:
D, [2014-08-16T13:48:09.510013 #19418] DEBUG -- : SQL (0.6ms) UPDATE "system_flights_cacheds" SET "client_stuff" = '{"captcha":"656556"}' WHERE "system_flights_cacheds"."guid" = '5647046e-4194-498e-a0d7-512614b147d8'
But, hard as it is to believe, my database record is not actually updated. Previously I used .save, also without success; in that case it creates a SAVEPOINT.
My code in trouble is basically an API endpoint:
cache = System::Flights::Cached.search_cache options
# update database, when the captcha is present. this way, the worker
# when updating the database can see the changes and act accordingly!
if cache && params[:captcha]
  # remember, anyone can (basically) see the captcha. thus,
  # this is a bit paranoid, only allow captcha update
  # if the user is same! in the json, if not forgotten,
  # captcha is only displayed when the user_id is equal
  server_stuff = cache.server_stuff.with_indifferent_access
  if server_stuff[:user_id] == current_user.id
    cache.time_renewed = 10
    cache.client_stuff_will_change!
    cache.client_stuff ||= {}
    cache.client_stuff[:captcha] = params[:captcha]
    # cache.save!
    cache.update_columns(client_stuff: cache.client_stuff)
  end
else
  # only spawn worker if there is no captcha parameter passed
  spawn_search_worker({user_id: current_user.id, options: options})
end
The client can hit this endpoint at any time and it will spawn a worker. When a record already exists in the database but is_processed is false, the worker will quit. Thus, calling this multiple times is fine and also serves as a way to check whether the work is done.
The worker waits for the client to enter a captcha. So we have a class like WaitableLogin that basically does:
max_repeat = 3 # 14
# annul flag, if set to true, the data will not get persisted.
annul = false
while max_repeat > 0
  # interval of 5 secs that worker can check the database
  sleep 5
  max_repeat -= 1
  # break if captcha already entered by client
  # seek from the database if the client has posted
  # the captcha text
  cache = System::Flights::Cached.search_cache options
  client_stuff = nil
  client_stuff = cache.client_stuff.with_indifferent_access if cache && cache.client_stuff
  if client_stuff && client_stuff[:captcha]
    captcha_text = client_stuff[:captcha]
    airline.fill_captcha(captcha_text).finalize_login
    puts "SOMEHOW I AM HERE: #{captcha_text}"
    # remove all server's stuff
    cache.server_stuff_will_change!
    cache.server_stuff.clear
    cache.save!
    annul = airline.in_login_page?
  end
end
So, WaitableLogin checks whether client_stuff has been updated. If it has, we know that the client has submitted the captcha (through the endpoint, which checks whether captcha is a param and updates the database if the captcha field is present).
Control is then transferred back to the worker. You can see that cache is used in many places across files; it is just a variable name and has nothing to do with caching in Rails or anything else.
When I run this normally in the browser, I don't see any problem. In fact, there is no SAVEPOINT even if I use .save. I thought the SAVEPOINT was causing a bug somewhere, so I decided to try .update_columns instead. But, again, with no success.
This is what the test looks like:
before(:each) do
System::Flights::Cached.delete_all
end
describe "requests" do
it "should process 2a1c1i" do
cached = nil
post("/api/v1/x.json", {
access_token: CommonFlightData::ACCESS_TOKEN,
business_token: CommonFlightData::BUSINESS_TOKEN,
captcha: ""
}.merge!(CommonFlightData.oneway_1a(from: "8-9-2014")))
puts "enter the captcha: "
captcha = STDIN.gets.chomp
puts "Entered: #{captcha}"
post("/api/v1/x.json", {
access_token: CommonFlightData::ACCESS_TOKEN,
business_token: CommonFlightData::BUSINESS_TOKEN,
captcha: captcha
}.merge!(CommonFlightData.oneway_1a(from: "8-9-2014")))
sleep 10
So what am I missing? I'm tired. No error is raised. When I call .inspect after update_columns, everything appears to be updated. But when I look at the database, nothing is updated.
EDIT: I added lock_version so that I have optimistic locking (enabled by default, I think). As expected, it turned out to be set to 2.
EDIT 2: If I run an update from a Rails console while the code is asking for the captcha, IT UPDATES the data. So why won't the RSpec spec that calls the API endpoint to submit a captcha update the row? All the real-life code, run outside the spec, executes fine.

Make Ruby/Rails continue method after encountering error

def checkdomains
  @domains = Domain.all
  #@domains.where(:confirmed => "yes").each do |f|
  @domains.each do |f|
    r = Whois.whois(f.domain)
    if r.available? == true
      EmailNotify.notify_email(f).deliver
    end
  end
end
This method crashes when it comes upon an invalid url (the whois gem gives an error), and doesn't keep on checking the rest of the domains. Is there any way I can have it continue to check the rest of the domains even if it crashes on one? At least until I can sort out phising out each domain.
@domains.each do |f|
  begin
    r = Whois.whois(f.domain)
    if r.available? == true
      EmailNotify.notify_email(f).deliver
    end
  # StandardError rather than Exception, so interrupts and system exits are not swallowed
  rescue StandardError => e
    puts "Error #{e}"
    next # <= This is what you were looking for
  end
end
When you say "crashing out", I assume you mean that an exception is being raised. If so, just trap the exception, do what you want with it (store the address in a bad_email table or whatever), then carry on doing what you are doing. Your log file will tell you which exception is being raised, so you know what your rescue statement should be.
So:
begin
  r = Whois.whois(f.domain)
  if r.available? == true
    EmailNotify.notify_email(f).deliver
  end
rescue WhateverException
  # do something here: re-raise the error, store the address in a bad_emails table, do both, or simply do nothing at all
end
If you are referring to something else, like the whole app dying, then I haven't got a clue and there is not enough info to advise further. Sorry.
As jamesw suggests, you can wrap the statements in an exception handler, dealing with them as they occur. Let me suggest further that, wherever your program gets these (possibly invalid) domain names, you validate them as soon as you get them, and throw out the invalid ones. That way, by the time you reach this loop, you already know you're iterating over a list of good domains.
EDIT: For domain name validation, check here.
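Combining the two suggestions (validate early, then rescue per item so one failure does not stop the loop) might look like the sketch below. Everything named here is hypothetical: DOMAIN_FORMAT is a deliberately simple format check, not a complete validator, and lookup stands in for whatever performs the real availability check (for example a wrapper around the whois call).

```ruby
# Deliberately simple, hypothetical format check; not a full validator.
DOMAIN_FORMAT = /\A(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}\z/i

# lookup is any callable that returns true when the domain is available
# and may raise on failure.
def check_domains(domains, lookup)
  valid, invalid = domains.partition { |d| d.match?(DOMAIN_FORMAT) }
  available = []
  failed = []
  valid.each do |domain|
    begin
      available << domain if lookup.call(domain)
    rescue StandardError => e
      # Record the failure and carry on with the next domain.
      failed << [domain, e.message]
    end
  end
  { available: available, invalid: invalid, failed: failed }
end
```

Returning the invalid and failed lists separately makes it easy to store them in something like the bad_emails table mentioned above instead of only printing them.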

find_or_create and race-condition in rails, theory and production

Hi, I have this piece of code:
class Place < ActiveRecord::Base
  def self.find_or_create_by_latlon(lat, lon)
    place_id = call_external_webapi
    result = Place.where(:place_id => place_id).limit(1)
    result = Place.create(:place_id => place_id, ... ) if result.empty? #!
    result
  end
end
Then I'd like to do in another model or controller
p = Post.new
p.place = Place.find_or_create_by_latlon(XXXXX, YYYYY) # race-condition
p.save
But Place.find_or_create_by_latlon takes too long to get the data when the action executed is create, and sometimes in production p.place is nil.
How can I force it to wait for the response before executing p.save?
Thanks for your advice.
You're right that this is a race condition and it can often be triggered by people who double click submit buttons on forms. What you might do is loop back if you encounter an error.
result = Place.find_by_place_id(...) ||
Place.create(...) ||
Place.find_by_place_id(...)
There are more elegant ways of doing this, but the basic method is here.
I had to deal with a similar problem. In our backend a user is created from a token if the user doesn't exist. AFTER the user record has been created, a slow API call is sent to update the user's information.
def self.find_or_create_by_facebook_id(facebook_id)
  User.find_by_facebook_id(facebook_id) || User.create(facebook_id: facebook_id)
rescue ActiveRecord::RecordNotUnique => e
  User.find_by_facebook_id(facebook_id)
end
def self.find_by_token(token)
  facebook_id = get_facebook_id_from_token(token)
  user = User.find_or_create_by_facebook_id(facebook_id)
  if user.unregistered?
    user.update_profile_from_facebook
    user.mark_as_registered
    user.save
  end
  return user
end
The first step of the strategy is to remove the slow API call (in my case update_profile_from_facebook) from the create method. Because the operation takes so long, you significantly increase the chance of duplicate insert operations when you include it as part of the call to create.
The second step is to add a unique constraint to your database column to ensure duplicates aren't created.
The final step is to create a function that will catch the RecordNotUnique exception in the rare case where duplicate insert operations are sent to the database.
This may not be the most elegant solution but it worked for us.
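The find, insert, rescue, find-again flow can be tried out without a database using the simulation below. It is only an illustration of the control flow under stated assumptions: FakePlaces is a hypothetical in-memory table whose insert behaves like an INSERT against a unique index, and DuplicateKey stands in for ActiveRecord::RecordNotUnique.

```ruby
# Stand-in for ActiveRecord::RecordNotUnique (assumption for this sketch).
class DuplicateKey < StandardError; end

# Hypothetical in-memory table with a "unique index" on place_id.
class FakePlaces
  def initialize
    @rows = {}
    @lock = Mutex.new
  end

  def find(place_id)
    @lock.synchronize { @rows[place_id] }
  end

  # Mimics an INSERT that the unique constraint rejects for duplicates.
  def insert(place_id)
    @lock.synchronize do
      raise DuplicateKey, place_id if @rows.key?(place_id)
      @rows[place_id] = { place_id: place_id }
    end
  end

  def count
    @lock.synchronize { @rows.size }
  end
end

# Find first; if nothing is there, try to insert; if another caller won
# the race in between, fall back to reading the row it created.
def find_or_create(table, place_id)
  table.find(place_id) || table.insert(place_id)
rescue DuplicateKey
  table.find(place_id)
end

table = FakePlaces.new
results = 8.times.map { Thread.new { find_or_create(table, 'abc') } }.map(&:value)
```

However the threads interleave, the unique check guarantees a single row, and every caller ends up with that row rather than nil.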
I hit this inside a Sidekiq job that retries, gets the error repeatedly and eventually clears itself. The best explanation I've found is in a blog post here. The gist is that Postgres keeps an internally stored value for incrementing the primary key, and that value gets out of sync somehow. This rings true for me because I'm setting the primary key myself rather than just using an incremented value, so that's likely how this cropped up. The solution from the comments in the link above appears to be to call ActiveRecord::Base.connection.reset_pk_sequence!(table_name). This cleared up the issue for me.
begin
  result = Place.where(:place_id => place_id).limit(1)
  result = Place.create(:place_id => place_id, ... ) if result.empty? #!
rescue ActiveRecord::StatementInvalid => error
  @save_retry_count = (@save_retry_count || 1)
  ActiveRecord::Base.connection.reset_pk_sequence!(:place)
  retry if( (@save_retry_count -= 1) >= 0 )
  raise error
end
