I'm wondering how to handle webhooks from a third-party API in general.
In my case, I need to handle webhooks from Stripe.
So I use StripeEvent to receive the incoming webhooks and dispatch them to handlers. It provides an easy-to-use interface for handling events from Stripe.
The main implementation is:
Take the ID from the POSTed event data.
Stripe doesn't sign events, so verify the event by fetching it from the Stripe API.
Store event IDs and reject any ID we've already seen, to protect against replay attacks.
Everything works so far.
However, let's assume that I am:
handling fairly complex logic within the webhook handler
receiving many webhook requests
In this case, I feel I should consider using a background job.
Best practices from the Stripe docs:
If your webhook script performs complex logic, or makes network calls, it's possible the script would timeout before Stripe sees its complete execution. For that reason, you may want to have your webhook endpoint immediately acknowledge receipt by returning a 2xx HTTP status code, and then perform the rest of its duties.
Here is my code. Which part of it should I bundle up and enqueue?
StripeEvent.event_retriever = lambda do |params|
  # Reject event IDs we have already seen (replay protection)
  return nil if StripeWebhook.exists?(stripe_id: params[:id])
  StripeWebhook.create!(stripe_id: params[:id])
  # In tests, build the event from the posted params instead of calling Stripe
  return Stripe::Event.construct_from(params.deep_symbolize_keys) if Rails.env.test?
  # Otherwise verify the event by fetching it from the Stripe API
  Stripe::Event.retrieve(params[:id])
end
StripeEvent.configure do |events|
events.subscribe 'invoice.created', InvoiceCreated.new # handling the invoice.created event in service object
events.subscribe 'invoice.payment_succeeded', InvoicePaymentSucceeded.new
...
end
Short answer: enqueue the whole event. Serialize the Stripe::Event instance to a string with Marshal::dump, then deserialize it back into a Stripe::Event in your background worker with Marshal::load.
We also wanted to process the webhook requests in the background using our delayed_job system, which stores the jobs in a database table and the job arguments in a string column.
To do that we first needed to serialize the Stripe::Event to a string in our StripeEvent.configure block:
# event is an instance of Stripe::Event
serialized_event = Marshal::dump(event)
Then we queue the background job rather than handling it synchronously, passing our serialized event as a string to where it is stored (in our case a database table) to await being processed.
Then our background worker code can deserialize the string it reads back to a Stripe::Event:
event = Marshal::load(serialized_event)
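The round trip described in this answer can be sketched without Stripe or delayed_job. This is only an illustration under stated assumptions: FakeEvent stands in for Stripe::Event, JOBS stands in for the jobs table, and the helper names are hypothetical. One detail worth knowing is that Marshal output is a binary string, so the sketch Base64-encodes it before storing it as text; whether you need that depends on your column type.

```ruby
# Stand-in for Stripe::Event so the sketch runs without the stripe gem (assumption).
FakeEvent = Struct.new(:id, :type, :data)

# Stand-in for the jobs table: a thread-safe in-memory queue of strings (assumption).
JOBS = Queue.new

# Marshal.dump returns a binary (ASCII-8BIT) string. pack("m0") applies
# strict Base64 so the payload is plain ASCII and safe for a text column.
def serialize_event(event)
  [Marshal.dump(event)].pack("m0")
end

# Only call Marshal.load on data you wrote yourself; it is unsafe on untrusted input.
def deserialize_event(payload)
  Marshal.load(payload.unpack1("m0"))
end

# Webhook side: enqueue the serialized event and acknowledge immediately.
def receive_webhook(event)
  JOBS << serialize_event(event)
  200 # status code returned to the sender
end

# Worker side: pop one job and run the (potentially slow) handler logic.
def work_one(handled)
  event = deserialize_event(JOBS.pop)
  handled << "#{event.type}:#{event.id}"
end

status = receive_webhook(FakeEvent.new("evt_123", "invoice.created", { "amount" => 500 }))
handled = []
work_one(handled)
```

The webhook side does nothing but serialize and enqueue, which is what lets it return a 2xx status quickly; all event-specific logic lives on the worker side.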
Related
My app has some heavy callback validations when I create a new customer. Basically I check multiple APIs to see if there's a match before creating a new customer record. I don't want this to happen after create, because I'd rather not save the record in the first place if there aren't any matches.
I have a webhook setup that creates a new customer. The problem is that, because my customer validations take so long, the webhook continues to fire because it doesn't get the immediate response.
Here's my Customer model:
validates :shopify_id, uniqueness: true, if: 'shopify_id.present?'
before_validation :get_external_data, :on => :create
def get_external_data
## heavy API calls that I don't want to perform multiple times
end
My hook:
customer = shop.customers.new(:first_name => first_name, :last_name => last_name, :email => email, :shopify_url => shopify_url, :shopify_id => id)
customer.save
head :ok
customer.save is taking about 20 seconds.
To clarify, here's the issue:
Webhook is fired
Heavy API Calls are made
Second Webhook is fired (API calls still being made from first webhook). Runs Heavy API Calls
Third Webhook is fired
This happens until finally the first record is saved so that I can now check to make sure shopify_id is unique
Is there a way around this? How can I defensively program to make sure no duplicate records start to get processed?
What an interesting question, thank you.
Asynchronicity
The main issue here is the dependency on external web hooks.
The latency required to test these will not only impact your save times, but also prevent your server from handling other requests (unless you're using some sort of multi processing).
It's generally not a good idea to have your flow dependent on more than one external resource. In this case, it's legit.
The only real suggestion I have is to make it an asynchronous flow...
--
Asynchronous vs synchronous execution, what does it really mean?
When you execute something synchronously, you wait for it to finish
before moving on to another task. When you execute something
asynchronously, you can move on to another task before it finishes.
In JS, the most famous example of making something asynchronous is to use an Ajax callback, i.e. sending a request through Ajax, using some sort of "waiting" process to keep the user updated, then returning the response.
I would propose implementing this for the front-end. The back-end would have to ensure the server's hands are not tied whilst processing the external API calls. This would either have to be done using some other part of the system (not requiring the use of the web server process), or separating the functionality into some other format.
Ajax
I would most definitely use Ajax on the front-end, or another asynchronous technology (web sockets?).
Either way, when a user creates an account, I would create a "pending" screen. Using Ajax is the simplest example of this; however, it is massively limited in scope (i.e. if the user refreshes the page, the connection is lost).
Maybe someone could suggest a way to regain state in an asynchronous system?
You could handle it with Ajax callbacks:
#app/views/users/new.html.erb
<%= form_for @user, remote: true do |f| %>
<%= f.text_field ... %>
<%= f.submit %>
<% end %>
#app/assets/javascripts/application.js
$(document).on("ajax:beforeSend", "#new_user", function(event, xhr, settings){
  // start "pending" screen
}).on("ajax:send", "#new_user", function(event, xhr){
  // keep user updated somehow
}).on("ajax:success", "#new_user", function(event, data, status, xhr){
  // remove "pending" screen, show response
});
This will give you a front-end flow which does not jam up the server, i.e. you can still do "stuff" on the page whilst the request is processing.
--
Queueing
The second part of this will be to do with how your server processes the request.
Specifically, how it deals with the API requests, as they are what are going to be causing the delay.
The only way I can think of at present will be to queue up requests, and have a separate process go through them. The main benefit here being that it will make your Rails app's request asynchronous, instead of having to wait around for the responses to come.
You could use a gem such as Resque to queue the requests (it uses Redis), allowing you to send the request to the Resque queue & capture its response. This response will then form your response to your ajax request.
You'd probably have to set up a temporary user before doing this:
#app/models/user.rb
class User < ActiveRecord::Base
after_create :check_shopify_id
private
def check_shopify_id
#send to resque/redis
end
end
Of course, this is a very high level suggestion. Hopefully it gives you some better perspective.
This is a tricky issue since your customer creation is dependent on an expensive validation. I see a few ways you can mitigate this, but it will be a "lesser of evils" type decision:
Can you pre-call/pre-load the customer list? If so you can cache the list of customers and validate against that instead of querying on each create. This would require a cron job to keep a list of customers updated.
Create the customer and then perform the customer check as a "validation" step. As in, set a validated flag on the customer and then run the check once in a background task. If the customer exists, merge with the existing customer; if not, mark the customer as valid.
Either choice will require workarounds to avoid the expensive calls.
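The second option (create first, check once in the background) can be sketched as "claim the ID, acknowledge, process later". This is a rough illustration, not a drop-in solution: ClaimStore is a hypothetical in-memory stand-in for a table with a unique index on shopify_id, and PENDING stands in for a job queue. In a real Rails app the claim would more likely be an insert that rescues ActiveRecord::RecordNotUnique.

```ruby
require "set"

# Hypothetical in-memory stand-in for a table with a unique index on shopify_id.
class ClaimStore
  def initialize
    @seen = Set.new
    @lock = Mutex.new
  end

  # Returns true only for the first caller with a given id.
  # Set#add? returns nil when the element is already present.
  def claim(shopify_id)
    @lock.synchronize { !@seen.add?(shopify_id).nil? }
  end
end

CLAIMS = ClaimStore.new
PENDING = Queue.new # stand-in for the background job queue (assumption)

# Webhook handler: do only the cheap claim, then acknowledge right away.
# Repeated deliveries of the same id are acknowledged but not enqueued again.
def handle_customer_webhook(shopify_id)
  PENDING << shopify_id if CLAIMS.claim(shopify_id)
  200
end

3.times { handle_customer_webhook(1001) } # the sender retries the same webhook
handle_customer_webhook(1002)
```

Because the claim is cheap and happens before any slow work, retried webhooks see the ID as taken and the heavy API calls run at most once per customer.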
I want to create a callback in my User model. After a user is created, the callback runs get_followers to fetch that user's Twitter followers (via the FullContact API).
This is all a bit new to me...
Is this the correct approach putting the request in a callback or should it be in the controller somewhere? And then how do I make the request to the endpoint in rails, and where should I be processing the data that is returned?
EDIT... Is something like this okay?
User.rb
require 'open-uri'
require 'json'
class Customer < ActiveRecord::Base
  after_create :get_twitter
  private
  def get_twitter
    source = "url-to-parse.com"
    # Fetch the response body, then parse it as JSON
    @data = JSON.parse(URI.open(source).read)
  end
end
A few things to consider:
The callback will run for every Customer that is created, not just those created in the controller. That may or may not be desirable, depending on your specific needs. For example, you will need to handle this in your tests by mocking out the external API call.
Errors could occur in the callback if the service is down, or if a bad response is returned. You have to decide how to handle those errors.
You should consider having the code in the callback run in a background process rather than in the web request, if it is not required to run immediately. That way errors in the callback will not produce a 500 page, and will improve performance since the response can be returned without waiting for the callback to complete. In such a case the rest of the application must be able to handle a user for whom the callback has not yet completed.
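To make the error-handling point concrete, here is a minimal sketch of a lookup that fails softly. The helper name fetch_followers and the idea of passing in a fetcher callable are assumptions made so the example runs without a network; in a real app the callable would wrap the HTTP request, and the whole method would ideally run inside a background job.

```ruby
require "json"
require "socket"
require "timeout"

# Hypothetical helper: `fetcher` is any callable that returns the raw response body.
# Returns the parsed data, or nil when the service is unreachable, times out,
# or replies with something that is not valid JSON.
def fetch_followers(fetcher)
  JSON.parse(fetcher.call)
rescue JSON::ParserError, Timeout::Error, SocketError, SystemCallError => e
  warn "follower lookup failed: #{e.class}"
  nil
end

ok      = fetch_followers(-> { '{"followers": 3}' })       # well-formed response
bad     = fetch_followers(-> { "<html>Bad Gateway</html>" }) # not JSON
offline = fetch_followers(-> { raise Errno::ECONNREFUSED })  # service is down
```

Returning nil rather than raising means the caller has to cope with a user whose follower data is missing, which is the same situation the application faces while a background job has not finished yet.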
I have a long-running task in the background. How exactly would I pull the status from the background task, or would it be better to somehow communicate the task's completion to my front end?
Background :
Basically my app uses third party service for processing data, so I want this external web service workload not to block all the incoming requests to my website, so I put this call inside a background job (I use sidekiq). And so when this task is done, I was thinking of sending a webhook to a certain controller which will notify the front end that the task is complete.
How can I do this? Is there a better solution for this?
Update:
My app is hosted on heroku
Update II:
I've done some research on the topic and found that I can create a separate app on Heroku to handle this. I found this example:
https://github.com/heroku-examples/ruby-websockets-chat-demo
This long running task will be run per user, on a website with a lot of traffic, is this a good idea?
I would implement this using a pub/sub system such as Faye or Pusher. The idea behind this is that you would publish the status of your long running job to a channel, which would then cause all subscribers of that channel to be notified of the status change.
For example, within your job runner you could notify Faye of a status change with something like:
client = Faye::Client.new('http://localhost:9292/')
client.publish('/jobstatus', {id: jobid, status: 'in_progress'})
And then in your front end you can subscribe to that channel using javascript:
var client = new Faye.Client('http://localhost:9292/');
client.subscribe('/jobstatus', function(message) {
alert('the status of job #' + message.id + ' changed to ' + message.status);
});
Using a pub/sub system in this way allows you to scale your realtime page events separately from your main app - you could run Faye on another server. You could also go for a hosted (and paid) solution like Pusher, and let them take care of scaling your infrastructure.
It's also worth mentioning that Faye uses the Bayeux protocol, which means it will use WebSockets where they are available, and long-polling where they are not.
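If the publish/subscribe idea itself is unfamiliar, the pattern can be sketched in a few lines of plain Ruby. This is not Faye's API and involves no networking; Hub is a hypothetical in-process class that only shows how publishers and subscribers are decoupled through a channel name.

```ruby
# Minimal in-process publish/subscribe hub. It illustrates the pattern only;
# a real system such as Faye or Pusher delivers messages over the network.
class Hub
  def initialize
    @subscribers = Hash.new { |hash, channel| hash[channel] = [] }
  end

  # Register a block to be called for every message on the channel.
  def subscribe(channel, &callback)
    @subscribers[channel] << callback
  end

  # Deliver the message to every subscriber of the channel (possibly none).
  def publish(channel, message)
    @subscribers.fetch(channel, []).each { |callback| callback.call(message) }
  end
end

hub = Hub.new
updates = []
hub.subscribe("/jobstatus") { |message| updates << message }

# The job runner publishes without knowing who is listening.
hub.publish("/jobstatus", { id: 7, status: "in_progress" })
hub.publish("/jobstatus", { id: 7, status: "done" })
```

The job code only knows the channel name, so subscribers can be added, removed, or moved to another server without touching the job.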
We have this pattern and use two different approaches. In both cases background jobs are run with Resque, but you could likely do something similar with DelayedJob or Sidekiq.
Polling
In the polling approach, we have a javascript object on the page that sets a timeout for polling with a URL passed to it from the rails HTML view.
This causes an Ajax ("script") call to the provided URL, which means Rails looks for the JS template. So we use that to respond with the state and fire an event for the object to respond to, whether the result is available or not.
This is somewhat complicated and I wouldn't recommend it at this point.
Sockets
The better solution we found was to use WebSockets (with shims). In our case we use PubNub but there are numerous services to handle this. That keeps the polling/open-connection off your web server and is much more cost-effective than running the servers needed to handle these connections.
You've stated you are looking for front-end solutions and you can handle all the front-end with PubNub's client JavaScript library.
Here's a rough idea of how we notify PubNub from the backend.
class BackgroundJob
  @queue = :some_queue
  def perform
    # Do some action
  end
  def after_perform
    publish some_state, client_channel
  end
  private
  def publish(some_state, client_channel)
    Pubnub.new(
      publish_key: Settings.pubnub.publish_key,
      subscribe_key: Settings.pubnub.subscribe_key,
      secret_key: Settings.pubnub.secret_key
    ).publish(
      channel: client_channel,
      message: some_state.to_json,
      http_sync: true
    )
  end
end
The simplest approach that I can think of is that you set a flag in your DB when the task is complete, and your front-end (view) sends an ajax request periodically to check the flag state in db. In case the flag is set, you take appropriate action in the view. Below are code samples:
Since you said that this long-running task needs to run per user, let's add a boolean column, task_complete, to the users table. When you add the job to Sidekiq, you unset the flag:
# Sidekiq worker: app/workers/task.rb
class Task
include Sidekiq::Worker
def perform(user_id)
user = User.find(user_id)
# Long running task code here, which executes per user
user.task_complete = true
user.save!
end
end
# When adding the task to sidekiq queue
user = User.find(params[:id])
# flag would have been set to true by previous execution
# In case it is false, it means sidekiq already has a job entry. We don't need to add it again
if user.task_complete?
Task.perform_async(user.id)
user.task_complete = false
user.save!
end
In the view you can periodically check whether the flag was set using ajax requests:
<script type="text/javascript">
var complete = false;
(function worker() {
$.ajax({
url: 'task/status/<%= @user.id %>',
success: function(data) {
// update the view based on ajax request response in case you need to
},
complete: function() {
// Schedule the next request when the current one's complete, and in case the global variable 'complete' is set to true, we don't need to fire this ajax request again - task is complete.
if(!complete) {
setTimeout(worker, 5000); // in milliseconds
}
}
});
})();
</script>
# status action which returns the status of task
# GET /task/status/:id
def status
@user = User.find(params[:id])
end
# status.js.erb - add view logic based on what you want to achieve, given whether the task is complete or not
<% if @user.task_complete? %>
$('#success').show();
complete = true;
<% else %>
$('#processing').show();
<% end %>
You can set the timeout based on the average execution time of your task. If your task takes 10 minutes on average, there's no point in checking every 5 seconds.
Also in case your task execution frequency is something complex (and not 1 per day), you may want to add a timestamp task_completed_at and base your logic on a combination of the flag and timestamp.
As for this part:
"This long running task will be run per user, on a website with a lot of traffic, is this a good idea?"
I don't see a problem with this approach, though architectural changes like executing jobs (sidekiq workers) on separate hardware will help. These are lightweight ajax calls, and some intelligence built into your javascript (like the global complete flag) will avoid the unnecessary requests. In case you have huge traffic, and DB reads/writes are a concern then you may want to store that flag directly into redis instead (since you already have it for sidekiq). I believe that will resolve your read/write concerns, and I don't see that it is going to cause problems. This is the simplest and cleanest approach I can think of, though you can try achieving the same via websockets, which are supported by most modern browsers (though can cause problems in older versions).
I am unfamiliar with Webhooks but I feel like they are the right thing for my app.
I have the following documentation for FluidSurveys webhook
I understand how I can make the webhook through a POST request to their API, but I don't know how can I tell where the webhook is actually going to send the response.
Can I pass any subscription url I want? e.g. https://www.myapp.com/test and is that where webhook will send the data? Also, after the webhook is created I'm not sure how to ensure my Rails app will receive the response that is initiated.
I assume a controller method that corresponds with the url I provide to the webhook.
If I'm correct on the controller handling the webhook, what would that look like?
Any guidance is appreciated.
Webhooks hook into your app via a callback URL you provide. This is just an action in one of your controllers that responds to POST requests and handles the webhook request. Every time something changes to the remote service, the remote service makes a request to the callback URL you provided, hence triggering the action code.
I'll use the survey created event as an example. You start by defining a callback action for this event, where you handle the request coming from the webhook. As stated in the documentation, the webhook sends a request with the following body:
survey_creator_name=&survey_name=MADE+A+NEW+SURVEY&survey_creator_email=timothy@example.com&survey_url=http%3A%2F%2Fexample.com%2Fsurveys%2Fbob%2Fmade-a-new-survey%2F
Let's leave the headers for now, they don't contain important information. The available body parameters (survey_creator_name, survey_name etc.) will reflect all details regarding the new survey available on the remote service. So let's write a callback action that handles this request:
class HooksController < ApplicationController
  # Webhook requests come from an external service and carry no CSRF token
  # (needed when CSRF protection is enabled)
  skip_before_action :verify_authenticity_token

  def survey_created_callback
    # If the body contains the survey_name parameter...
    if params[:survey_name].present?
      # Create a new Survey object based on the received parameters...
      survey = Survey.new(:name => params[:survey_name])
      survey.url = params[:survey_url]
      survey.creator_email = params[:survey_creator_email]
      survey.save!
    end
    # The webhook doesn't require a response body, so reply with
    # an empty 200 OK
    head :ok
  end
end
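To see what the action receives, the form-encoded body quoted earlier can be decoded with Ruby's standard library. This is only a sketch of what Rails does for you when it builds params: the helper name parse_webhook_body is hypothetical, and the sample body assumes the @ in the email address arrives percent-encoded as %40.

```ruby
require "uri"

# Decode an application/x-www-form-urlencoded body into a Hash with string keys.
# "+" becomes a space and %XX sequences are unescaped.
def parse_webhook_body(body)
  URI.decode_www_form(body).to_h
end

body = "survey_creator_name=&survey_name=MADE+A+NEW+SURVEY" \
       "&survey_creator_email=timothy%40example.com" \
       "&survey_url=http%3A%2F%2Fexample.com%2Fsurveys%2Fbob%2Fmade-a-new-survey%2F"

fields = parse_webhook_body(body)
fields["survey_name"]          # the decoded survey title
fields["survey_creator_name"]  # an empty string, because the value was blank
```

Note that a blank value such as survey_creator_name is still present as an empty string, which is why the action checks with present? rather than testing only that the key exists.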
Let's add the route for this (in config/routes.rb):
scope '/hooks', :controller => :hooks do
post :survey_created_callback
end
This will enable the POST /hooks/survey_created_callback route.
Now you'll need to subscribe this callback URL to the Webhooks API. First you'll want to know which hooks are available to you. You do this by placing a GET request at /api/v2/webhooks/. In the response you'll find the event name, survey and collector parameters.
Finally, you subscribe to one of the previously listed hooks by placing a request to the POST /api/v2/webhooks/subscribe/ URL with the following contents:
{
"subscription_url": "http://your-absolute-url.com/hooks/survey_created_callback",
"event": "[EVENT NAME FROM THE HOOKS LIST]",
"survey": "[SURVEY FROM THE HOOKS LIST]",
"collector": "[COLLECTOR FROM THE HOOKS LIST]"
}
The response to this will be a code 201 if the hook was created successfully, or code 409, if a webhook for the same callback URL already exists. Or something else, if this went bad :)
You can now test the hook, by creating a survey on the remote service and then watch it getting replicated in your Rails app.
Hope this helps...
(This question is a follow-up to How do I handle long requests for a Rails App so other users are not delayed too much? )
A user submits an answer to my Rails app and it gets checked in the back-end for up to 10 seconds. This would cause delays for all other users, so I'm trying out the delayed_job gem to move the checking to a Worker process. The Worker code returns the results back to the controller. However, the controller doesn't realize it's supposed to wait patiently for the results, so it causes an error.
How do I get the controller to wait for the results and let the rest of the app handle simple requests meanwhile?
In Javascript, one would use callbacks to call the function instead of returning a value. Should I do the same thing in Ruby and call back the controller from the Worker?
Update:
Alternatively, how can I call a controller method from the Worker? Then I could just call the relevant actions when its done.
This is the relevant code:
Controller:
def submit
question = Question.find params[:question]
user_answer = params[:user_answer]
@result, @other_stuff = SubmitWorker.new.check(question, user_answer)
render_ajax
end
submit_worker.rb :
class SubmitWorker
def check(question, user_answer)
#lots of code...
end
handle_asynchronously :check
end
Using DJ to offload the work is absolutely fine and normal, but making the controller wait for the response rather defeats the point.
You can add some form of callback to the end of your check method so that when the job finishes your user can be notified.
You can find some discussion on performing notifications in this question: push-style notifications simliar to Facebook with Rails and jQuery
Alternatively you can have your browser periodically call a controller action that checks for the results of the job - the results would ideally be an ActiveRecord object. Again you can find discussion on periodic javascript in this question: Rails 3 equivalent for periodically_call_remote
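The periodic-check approach can be sketched as "return a job ID now, look up the result later". Everything here is a stand-in: RESULTS plays the role of the ActiveRecord table, a Thread plays the role of the delayed_job worker, and the method names and the answer check are hypothetical.

```ruby
require "securerandom"

# Hypothetical in-memory stand-in for a results table.
RESULTS = {}
RESULTS_LOCK = Mutex.new

# "Controller" side: record a pending job, start the check, return immediately.
# The thread is returned only so this demo can wait for it to finish.
def submit_answer(answer)
  job_id = SecureRandom.hex(8)
  RESULTS_LOCK.synchronize { RESULTS[job_id] = { status: "pending" } }
  worker = Thread.new do
    correct = answer.strip == "42" # placeholder for the slow check
    RESULTS_LOCK.synchronize { RESULTS[job_id] = { status: "done", correct: correct } }
  end
  [job_id, worker]
end

# Polling action: report whatever is known right now, without blocking.
def job_status(job_id)
  RESULTS_LOCK.synchronize { RESULTS.fetch(job_id, { status: "unknown" }) }
end

job_id, worker = submit_answer(" 42 ")
worker.join                # a browser would poll instead of joining
final = job_status(job_id) # status is now "done"
```

The controller never waits on the worker; the browser holds the job ID and asks again until the status changes, so other requests are not held up.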
I think what you are trying to do here is a little contradictory, because you use delayed_job when you don't want to interrupt the control flow (so your users don't have to wait until the request completes).
But if you want your controller to wait until you get the results, then you don't want to use background processes like delayed_job.
You might want to think of different way of notifying the user, after you have done your checking, while keeping the background process as it is.