How to test for asynchronous HTTP requests in ruby using EventMachine - ruby-on-rails

I'm getting messages off a RabbitMQ queue, and each message is a URL that I want to make a request to. I'm using the AMQP gem to subscribe to the queue, which uses EventMachine, so I'm using the em-http-request library to make the HTTP requests. According to the documentation here: https://github.com/igrigorik/em-http-request/wiki/Parallel-Requests
The following will issue asynchronous HTTP requests:
EventMachine.run {
  http1 = EventMachine::HttpRequest.new('http://google.com/').get
  http2 = EventMachine::HttpRequest.new('http://yahoo.com/').get

  http1.callback { }
  http2.callback { }
}
So when I subscribe to the RabbitMQ queue I have the following code:
x = 0
EventMachine.run do
  connection = AMQP.connect(:host => '127.0.0.1')
  channel = AMQP::Channel.new(connection)
  channel.prefetch(50)
  queue = channel.queue("http.requests")
  exchange = channel.direct("")

  queue.subscribe do |metadata, payload|
    url = payload.inspect
    eval "
      @http#{x} = EventMachine::HttpRequest.new(url).get
      @http#{x}.callback do
        puts \"got a response\"
        puts @http#{x}.response
      end
      x = x + 1
    "
  end
end
This dynamically creates new instance variables and issues new HTTP requests, similar to the way described in the em-http-request documentation. But is there a way to test whether the requests are actually being made asynchronously? Is it possible to write to the console every time a GET request is fired off, so I can see they are fired off one after the other without waiting for a response?

You can try running tcpdump and analysing the output. If you see the TCP three-way handshakes for the two connections being interleaved, then the connections are happening in parallel.
This can't really be part of an automated test, though, if that's what you're aiming for. I would be happy just to verify once that the library does what it says it does, rather than making it part of a test suite.
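For example, something like this (assuming the requests go out over plain HTTP on port 80) prints only the initial SYN packets, so you can see whether the connection attempts interleave:
$ sudo tcpdump -nn 'tcp[tcpflags] == tcp-syn and port 80'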

A very simple example, demonstrating exactly what you want:
require 'em-http-request'

EM.run do
  # http://catnap.herokuapp.com/3 delays the HTTP response by 3 seconds.
  http1 = EventMachine::HttpRequest.new('http://catnap.herokuapp.com/3').get
  http1.callback { puts 'callback 1' }
  puts 'fired 1'

  http2 = EventMachine::HttpRequest.new('https://www.google.com/').get
  http2.callback { puts 'callback 2' }
  puts 'fired 2'
end
Output (for me):
fired 1
fired 2
callback 2
callback 1
Depending on your internet connection, Heroku, and Google, the response to the second HTTP request will likely come in first, and you can be sure the requests are indeed performed in parallel.

Related

Rails - multiple threads to avoid the Slack 3 second API response rule

I am working with the Slack API. My script does a bunch of external processing, and in some cases it can take around 3-6 seconds. What is happening is that the Slack API expects a 200 response within 3 seconds, and because my function has not finished within 3 seconds, it retries, and my script ends up posting the same automated responses 2-3 times.
I confirmed this by commenting out all the functions: I had no issue, and it posted the responses to Slack fine. I then added sleep 10 and it did the same responses 3 times, so the only thing different was that it took longer.
From what I read, I need to have threaded responses: first respond to the Slack API in thread 1, and then go about processing my functions.
Here is what I tried:
def events
  Thread.new do
    json = {
      "text": "Here is your 200 response immediately slack",
    }
    render(json: json)
  end

  puts "--------------------------------Json response started----------------------"
  sleep 30
  puts "--------------------------------Json response completed----------------------"
  puts "this is a successful response"
end
When I tested it, the same issue happened, so I tried using an online API tester: it hits the page, waits 30 seconds, and then returns the 200 response. But I need it to respond immediately with the 200, THEN process the rest, otherwise I will get duplicates.
Am I using threads properly, or is there another way to get around this Slack API 3 second response limit? I am new to both Rails and the Slack API so I'm a bit lost here.
Appreciate the eyes :)
I would recommend using ActiveJob to run the code in the background if you don't need to use the result of the code in the response. First, create an ActiveJob job by running:
bin/rails generate job do_stuff
And then open up the file created in app/jobs/do_stuff_job.rb and edit the #perform method to include your code (so the puts statements and sleep 30 in your example). Finally, from the controller action you can call DoStuffJob.perform_later and your job will run in the background! Your final controller action will look something like this:
def events
  DoStuffJob.perform_later # this schedules DoStuffJob to be done later, in
                           # the background, so it will return immediately
                           # and continue to the next line.
  json = {
    "text": "Here is your 200 response immediately slack",
  }
  render(json: json)
end
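For reference, here is a minimal sketch of what that job file might look like once you move the example code into #perform (ApplicationJob is the Rails 5+ base class; on older versions, subclass ActiveJob::Base instead):
# app/jobs/do_stuff_job.rb
class DoStuffJob < ApplicationJob
  queue_as :default

  def perform
    # The slow work from the question, now safely off the request thread.
    puts "--------------------------------Json response started----------------------"
    sleep 30
    puts "--------------------------------Json response completed----------------------"
    puts "this is a successful response"
  end
end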
As an aside, I'd highly recommend never using Thread.new in Rails. It can create some really confusing behavior, especially in test scripts, for a number of reasons, but usually because of how it interacts with open connections and specifically ActiveRecord.

Rails: How to listen to / pull from service or queue?

Most Rails applications work in a way that they are waiting for requests coming from a client and then do their magic.
But if I want to use a Rails application as part of a microservice architecture (for example) with some asynchronous communication (Service A sends an event into a Kafka or RabbitMQ queue and Service B - my Rails app - is supposed to listen to this queue), how can I tune/start the Rails app so that it immediately listens to a queue and is triggered by events from there? (Meaning the initial trigger does not come from a client, but from the app itself.)
Thanks for your advice!
I just set up RabbitMQ messaging within my application and will be implementing it for decoupled (multiple, distributed) applications in the next day or so. I found this article very helpful (and the RabbitMQ tutorials, too). All the code below is for RabbitMQ and assumes you have a RabbitMQ server up and running on your local machine.
Here's what I have so far - that's working for me:
#Gemfile
gem 'bunny'
gem 'sneakers'
I have a Publisher that sends to the queue:
# app/agents/messaging/publisher.rb
module Messaging
  class Publisher
    class << self
      def publish(args)
        connection = Bunny.new
        connection.start
        channel = connection.create_channel
        queue_name = "#{args.keys.first.to_s.pluralize}_queue"
        queue = channel.queue(queue_name, durable: true)
        channel.default_exchange.publish(args[args.keys.first].to_json, :routing_key => queue.name)
        puts "in #{self}.#{__method__}, [x] Sent #{args}!"
        connection.close
      end
    end
  end
end
Which I use like this:
Messaging::Publisher.publish(event: {... event details...})
Then I have my 'listener':
# app/agents/messaging/events_queue_receiver.rb
require_dependency "#{Rails.root.join('app','agents','messaging','events_agent')}"

module Messaging
  class EventsQueueReceiver
    include Sneakers::Worker
    from_queue :events_queue, env: nil

    def work(msg)
      logger.info msg
      response = Messaging::EventsAgent.distribute(JSON.parse(msg).with_indifferent_access)
      ack! if response[:success]
    end
  end
end
The 'listener' sends the message to Messaging::EventsAgent.distribute, which is like this:
# app/agents/messaging/events_agent.rb
require_dependency "#{Rails.root.join('app','agents','fsm','state_assignment_agent')}"

module Messaging
  class EventsAgent
    EVENT_HANDLERS = {
      enroll_in_program: ["FSM::StateAssignmentAgent"]
    }

    class << self
      def publish(event)
        Messaging::Publisher.publish(event: event)
      end

      def distribute(event)
        puts "in #{self}.#{__method__}, message"
        if event[:handler]
          puts "in #{self}.#{__method__}, event[:handler]: #{event[:handler]}"
          event[:handler].constantize.handle_event(event)
        else
          event_name = event[:event_name].to_sym
          EVENT_HANDLERS[event_name].each do |handler|
            event[:handler] = handler
            publish(event)
          end
        end
        return {success: true}
      end
    end
  end
end
Following the instructions on Codetunes, I have:
# Rakefile
# Add your own tasks in files placed in lib/tasks ending in .rake,
# for example lib/tasks/capistrano.rake, and they will automatically be available to Rake.
require File.expand_path('../config/application', __FILE__)
require 'sneakers/tasks'
Rails.application.load_tasks
And:
# app/config/sneakers.rb
Sneakers.configure({})
Sneakers.logger.level = Logger::INFO # the default DEBUG is too noisy
I open two console windows. In the first, I say (to get my listener running):
$ WORKERS=Messaging::EventsQueueReceiver rake sneakers:run
... a bunch of start up info
2016-03-18T14:16:42Z p-5877 t-14d03e INFO: Heartbeat interval used (in seconds): 2
2016-03-18T14:16:42Z p-5899 t-14d03e INFO: Heartbeat interval used (in seconds): 2
2016-03-18T14:16:42Z p-5922 t-14d03e INFO: Heartbeat interval used (in seconds): 2
2016-03-18T14:16:42Z p-5944 t-14d03e INFO: Heartbeat interval used (in seconds): 2
In the second, I say:
$ rails c --sandbox
2.1.2 :001 > Messaging::Publisher.publish({:event=>{:event_name=>"enroll_in_program", :program_system_name=>"aha_chh", :person_id=>1}})
in Messaging::Publisher.publish, [x] Sent {:event=>{:event_name=>"enroll_in_program", :program_system_name=>"aha_chh", :person_id=>1}}!
=> :closed
Then, back in my first window, I see:
2016-03-18T14:17:44Z p-5877 t-19nfxy INFO: {"event_name":"enroll_in_program","program_system_name":"aha_chh","person_id":1}
in Messaging::EventsAgent.distribute, message
in Messaging::EventsAgent.distribute, event[:handler]: FSM::StateAssignmentAgent
And in my RabbitMQ server, I can see the corresponding activity on the queue.
It's a pretty minimal setup and I'm sure I'll be learning a lot more in coming days.
Good luck!
I'm afraid that, for RabbitMQ at least, you will need a client. RabbitMQ implements the AMQP protocol, as opposed to the HTTP protocol used by web servers. As Sergio mentioned above, Rails is a web framework, so it doesn't have AMQP support built in. You'll have to use an AMQP client such as Bunny in order to subscribe to a RabbitMQ queue from within a Rails app.
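As a rough sketch (the queue name and the default localhost connection are assumptions), such a Bunny consumer could run in its own process alongside the Rails app:
require 'bunny'

connection = Bunny.new # assumes RabbitMQ on localhost with default credentials
connection.start

channel = connection.create_channel
queue = channel.queue('events_queue', durable: true)

# block: true keeps the process alive, waiting for messages.
queue.subscribe(block: true) do |_delivery_info, _properties, payload|
  puts "Received: #{payload}"
  # hand the payload off to your Rails code here
end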
Let's say Service A is sending some events to a Kafka queue; you can have a background process running alongside your Rails app which watches the Kafka queue and processes those queued messages. For the background process you can go for a cron job or something like Sidekiq.
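As an illustration, a minimal consumer loop with the ruby-kafka gem might look like this (the broker address and topic name are assumptions):
require 'kafka'

kafka = Kafka.new(['localhost:9092'], client_id: 'rails-consumer')

# Blocks and yields each message as it arrives on the topic.
kafka.each_message(topic: 'service_a_events') do |message|
  puts "Got: #{message.value}"
  # call into your Rails models/services here
end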
Rails is a lot of things. Parts of it handle web requests. Other parts (ActiveRecord) don't care whether you are a web request or a script or whatever. Rails itself does not even come with a production-worthy web server; you use other gems (e.g., thin for plain old web browsers, or wash_out for incoming SOAP requests) for that. Rails only gives you the infrastructure/middleware to combine all the pieces regarding servers.
Unless your queue can call out to your application in some fashion of HTTP, for example in the form of SOAP requests, you'll need something that listens to your queueing system, whatever that may be, and translates new "tickets" on your queue into controller actions in your Rails world.

How would I use rspec to test a method whose job is to post to a webhook?

I'm using RSpec to test my application and I'm having a hard time figuring out how to test this. Slack::Notifier's job is to send a POST request to a webhook. Once I call this method in RSpec, I don't know how to see the response. Also, is it possible to match the format of this text against an expected text somewhere? My method is below. Thanks.
def notify
  offset = 14400 # UTC to EST
  notifier = Slack::Notifier.new Rails.application.secrets.slack_organization_name, Rails.application.secrets.slack_token, channel: "##{Rails.application.secrets.slack_channel}", username: Rails.application.secrets.slack_user_name
  notifier.ping(":white_check_mark: *USAGE SUMMARY for #{(Time.now - offset).to_formatted_s(:long)}*")
  count = 0
  current_time = Time.now.to_i
  live_response.each do |r|
    if r["properties"]["time"] > ((current_time - offset) - 60) # && r["properties"]["$initial_referring_domain"] == "capture.com"
      notifier.ping("
*Name:* #{r["properties"]["$name"]}
*Event:* #{r["event"]}
*Keywords:* #{r["properties"]["keywords"]}
*Organization:* #{r["properties"]["organizationName"]}
*Email:* #{r["properties"]["$email"]}
*Time:* #{Time.at(r["properties"]["time"] + offset).utc.to_datetime.in_time_zone("Eastern Time (US & Canada)").to_formatted_s(:long_ordinal)}
*More Data:* #{ANALYTICS_URL}#{r["properties"]["distinct_id"]}
__________________________________________________
      ")
      count += 1
    end
  end
  notifier.ping("*There were #{count} events in this report.*")
end
Testing network communication (like API calls) is a tricky thing. Personally I would rely on programming by contract and testing in isolation - i.e. assume the external service is working fine and responds positively to a valid request.
Then you test your client code by checking that you are actually sending a valid request. To do this, stub the method where control exits your code into library/system code. For example, if you are making an HTTP GET request using a gem like HTTParty, then stub HTTParty.get, i.e. HTTParty.stub(:get), and in that stub verify that the correct parameters were sent.
On the other side of the spectrum you should also simulate both positive and negative responses from the web service and make sure your client code handles them in the expected manner.
If you make a real request then you are introducing a lot of dependencies into your test: a test setup of the external service, the risk of network issues (timeouts, n/w breakdown, etc.), problems with the external service, and maybe more.
If you yourself are writing that web service too, then test it in isolation as well, i.e. by simulating valid and invalid inputs and making sure they are handled properly. This part is pretty much your controller specs or request specs.
Once again, this is my opinion. Suggestions to do this in a better way and constructive criticism on the shortcomings of this approach are definitely welcome.
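As a concrete sketch of that first idea, you could stub out Slack::Notifier entirely and assert on what #notify tries to send. The class name UsageReporter and the stubbed live_response are assumptions standing in for wherever #notify actually lives:
require 'rails_helper'

RSpec.describe UsageReporter do
  describe '#notify' do
    let(:notifier) { instance_double(Slack::Notifier, ping: true) }

    before do
      # Intercept construction so no real HTTP request is ever made.
      allow(Slack::Notifier).to receive(:new).and_return(notifier)
    end

    it 'pings Slack with a usage summary header' do
      reporter = described_class.new
      allow(reporter).to receive(:live_response).and_return([]) # skip the real data fetch
      reporter.notify
      expect(notifier).to have_received(:ping)
        .with(a_string_including('USAGE SUMMARY'))
    end
  end
end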

Optimal way to structure polling external service (RoR)

I have a Rails application with a Document model that has an available flag. The document is uploaded to an external server where it is not immediately available (it takes time to propagate). What I'd like to do is poll for availability and update the model once the document is available.
I'm looking for the most performant solution for this process (the service does not offer callbacks):
1. Document is uploaded to the app
2. App uploads the document to the external server
3. App polls the URL (http://external.server.com/document.pdf) until it is available
4. App updates the model: Document.available = true
I'm stuck on step 3. I'm already using Sidekiq in my project. Is that an option, or should I use a completely different approach (cron job)?
Documents will be uploaded all the time, so it seems sensible to first poll the database/Redis for Documents which are not yet available.
See this answer: Making HTTP HEAD request with timeout in Ruby
Basically you set up a HEAD request for the known url and then asynchronously loop until you get a 200 back (with a 5 second delay between iterations, or whatever).
Do this from your controller after the document is uploaded:
Document.delay.poll_for_finished(@document.id)
And then in your document model:
def self.poll_for_finished(document_id)
  document = Document.find(document_id)
  # make sure the document exists and should be polled for
  return unless document.continue_polling?
  if document.remote_document_exists?
    document.available = true
  else
    document.poll_attempts += 1 # assumes you care how many times you've checked, could be ignored.
    Document.delay_for(5.seconds).poll_for_finished(document.id)
  end
  document.save
end

def continue_polling?
  # this can be more or less sophisticated
  !available && poll_attempts < 5
end

def remote_document_exists?
  # Net::HTTP.start takes a bare host name; passing the timeouts as options
  # ensures they also apply while opening the connection.
  Net::HTTP.start('external.server.com', open_timeout: 2, read_timeout: 2) do |http|
    "200" == http.head(path).code
  end
end
This is still a blocking operation. Opening the Net::HTTP connection will block if the server you're trying to contact is slow or unresponsive. If you're worried about that, use Typhoeus. See this answer for details: What is the preferred way of performing non blocking I/O in Ruby?
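For reference, a rough sketch of the same HEAD check using Typhoeus (the URL and timeout values are placeholders); libcurl drives the requests, so queued requests run concurrently rather than blocking one another:
require 'typhoeus'

request = Typhoeus::Request.new(
  'http://external.server.com/document.pdf',
  method: :head,
  timeout: 2
)

request.on_complete do |response|
  if response.code == 200
    # mark the document available here
  end
end

hydra = Typhoeus::Hydra.new
hydra.queue(request)
hydra.run # executes all queued requests concurrently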

Any idea why requests to vertx embedded in grails are synchronously queued up

Environment: Mac osx lion
Grails version: 2.1.0
Java: 1.7.0_08-ea
If I start up Vert.x in embedded mode within BootStrap.groovy and try to hit the same websocket endpoint through multiple browsers, the requests get queued up.
So depending on the timing of the requests, after one request is done with its execution the next request gets into the handler.
I've tried this with both websocket and SockJs and noticed the same behavior on both.
BootStrap.groovy (SockJs):
def vertx = Vertx.newVertx()
def server = vertx.createHttpServer()
def sockJSServer = vertx.createSockJSServer(server)
def config = ["prefix": "/eventbus"]

sockJSServer.installApp(config) { sock ->
    sleep(10000)
}

server.listen(8088)
javascript:
<script>
    function initializeSocket(message) {
        console.log('initializing web socket');
        var socket = new SockJS("http://localhost:8088/eventbus");
        socket.onmessage = function(event) {
            console.log("received message");
        };
        socket.onopen = function() {
            console.log("start socket");
            socket.send(message);
        };
        socket.onclose = function() {
            console.log("closing socket");
        };
    }
</script>
OR
BootStrap.groovy (Websockets):
def vertx = Vertx.newVertx()
def server = vertx.createHttpServer()
server.setAcceptBacklog(10000)

server.websocketHandler { ws ->
    println('**received websocket request')
    sleep(10000)
}.listen(8088)
javascript:
socket = new WebSocket("ws://localhost:8088/ffff");
socket.onmessage = function(event) {
    console.log("message received");
};
socket.onopen = function() {
    console.log("socket opened");
    socket.send(message);
};
socket.onclose = function() {
    console.log("closing socket");
};
From the helpful folks at vertx:
def server = vertx.createHttpServer() is actually a verticle, and a verticle is a single-threaded process
As bluesman says, each verticle runs in its own thread. You can spread your verticles across the cores in your hardware, even clustering them with more machines, but this adds capacity to accept simultaneous requests.
When programming realtime apps, we should try to build the response as soon as possible to avoid blocking. If you think your operation can be time-intensive, consider this model:
1. Make a request.
2. Pass the task to a worker verticle, assign the task a UUID (for example), and put it into the response. The caller now knows that the work is in progress and receives the response quickly.
3. When the worker finishes the task, it puts a notification on the event bus using the assigned UUID.
4. The caller checks the event bus for the task result.
This is typically done in a web application via websockets, SockJS, etc.
This way you can accept thousands of requests without blocking, and clients will receive the result without blocking the UI.
Vert.x uses the JVM to create a so-called "multi-reactor pattern", that is, a reactor pattern modified to perform better.
As far as I understand, it is not true that each verticle has its own thread: the fact is that each verticle is always served by the same event loop, but several verticles can be bound to the same event loop and there can be multiple event loops. An event loop is basically a thread, so a few threads can serve many verticles.
I didn't use Vert.x in embedded mode (and I don't know if the main concept changes), but you should get much better performance by instantiating many verticles for the job
Regards,
Carlo
As mentioned before, the Vert.x concept is based on the reactor pattern, which means a single instance has at least one single-threaded event loop and processes events sequentially. The request processing may consist of several events; the point is to serve the request and each event with non-blocking routines.
E.g. when you wait for a WebSocket message, the request should be suspended and woken back up when the message arrives. Whatever you do with the message should also be non-blocking, thus asynchronous - like any file I/O, networking I/O, or DB access. Vert.x provides basic elements which you should use to build such an async flow: Buffers, Pumps, Timers, and the EventBus.
To wrap it up - just never block. The use of sleep(10000) kills the concept. If you really need to delay execution, use Vert.x's timers instead.
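For instance, here is a rough sketch of the websocket handler above with the blocking sleep replaced by a one-shot timer (this assumes the embedded Groovy API from the question; the 10-second delay and the 'done' frame are only illustrative):
server.websocketHandler { ws ->
    println('**received websocket request')
    // Schedule work 10 seconds from now without blocking the event loop;
    // the handler returns immediately, so other connections are served.
    vertx.setTimer(10000) { timerId ->
        ws.writeTextFrame('done')
    }
}.listen(8088)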
