How to organize delayed results processing in microservices communication? - ruby-on-rails

I have the following system: my Rails server issues commands to the Flask server, and the latter responds immediately with status 200. The Flask server then runs a background task with some time-consuming function. After a little while, it comes up with some results and is designed to send the data back to the Rails server via HTTP (see diagram).
Each portion of Flask data can affect several Rails models (User, Post, etc.). Here I am faced with two questions:
How should I structure my controllers/actions on the Rails side in this case? Currently, I am thinking about one controller in which each action corresponds to one Python 'delayed' data portion.
Is this a normal way for microservices to communicate? Or can I organize it in a different, simpler way?

This sounds like pretty much your standard webhook process. Rails pokes Flask with a GET or POST request and Flask pokes back after a while.
For example, let's say we have reports, and after creating a report we need Flask to verify it:
class ReportsController < ApplicationController
  # POST /reports
  def create
    @report = Report.new(report_params)
    if @report.save
      FlaskClient.new.verify(@report) # this could be delegated to a background job
      redirect_to @report
    else
      render :new
    end
  end

  # PATCH /reports/:id/verify
  def verify
    # process request from flask
  end
end
class FlaskClient
  include HTTParty
  base_uri 'example.com/api'
  format :json

  # Asks Flask to verify the report and tells it where to send the result.
  def verify(report)
    self.class.post('/somepath', body: { id: report.id, callback_url: "/reports/#{report.id}/verify", ... })
  end
end
Of course, the Rails app does not actually know when Flask will respond, or that Flask and the background service are different. It just sends and responds to HTTP requests. You definitely don't want Rails to wait around, so save what you have and let the hook update the data later.
If you have to update the UI on the Rails side without the user having to refresh manually, you can use polling or WebSockets in the form of ActionCable.
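The callback half of this flow can be sketched in plain Ruby. This is only an illustration of what the body of `verify` might do, not the answer's actual implementation: the `Report` struct, the `REPORTS` hash and the payload keys (`status`, `score`) are all assumptions standing in for the real model, database and Flask payload.

```ruby
require 'json'

# Hypothetical in-memory stand-in for the Report model (not ActiveRecord).
Report = Struct.new(:id, :status, :score)

# Hypothetical store keyed by report id, standing in for the database.
REPORTS = { 1 => Report.new(1, 'pending', nil) }

# Applies the JSON that Flask posts back to the stored report.
# Returns a symbol that a controller could map to an HTTP status.
def apply_callback(report_id, raw_body)
  report = REPORTS.fetch(report_id) { return :not_found }
  payload = JSON.parse(raw_body)
  new_status = payload.fetch('status') # assumed required key
  report.status = new_status
  report.score  = payload['score']     # assumed optional key
  :ok
rescue JSON::ParserError, KeyError
  :unprocessable_entity
end
```

The report stays in its saved `'pending'` state until `apply_callback(1, '{"status":"verified","score":0.9}')` arrives, which mirrors the "save what you have, let the hook update it later" advice.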

Related

Rails 5 actioncable as a standalone server

There's not TOO much documentation on Action Cable, so I'm a little lost on this.
I'm playing with a Rails 5 app, and I'm trying to use it purely as an API while hosting my JS elsewhere. When I start my Action Cable server, I am able to connect to the WebSocket pretty easily using just my browser's built-in socket support:
var socket = new WebSocket('ws://localhost:3000/cable')
// and then do
socket.onmessage = function(data) { console.log(data) }
I connect successfully. I'm getting pings in the form of
MessageEvent {isTrusted: true, data: "{"type":"ping","message":1462992407}", ... etc
Except I can't seem to broadcast any messages down to the client. I tried:
ActionCable.server.broadcast('test',{ yes: true })
But only the pings come in. ActionCable comes with its own concepts that I haven't wrapped my head around fully yet like channels and stuff which "just work" in rails apps. But how can I successfully build a separate standalone JS app using actioncable's socket server?
I use ActionCable with an iOS application. Everything works just fine.
ActionCable uses pub/sub pattern.
Pub/Sub, or Publish-Subscribe, refers to a message queue paradigm
whereby senders of information (publishers), send data to an abstract
class of recipients (subscribers), without specifying individual
recipients. Action Cable uses this approach to communicate between the
server and many clients.
Which means that you should create a new channel first,
rails g channel my_channel
Then in your channel send some test message:
# app/channels/my_channel.rb
class MyChannel < ApplicationCable::Channel
  def subscribed
    stream_from "my_channel"
    ActionCable.server.broadcast "my_channel", 'Test message'
  end

  def unsubscribed
    # Any cleanup needed when channel is unsubscribed
  end
end
Then send the following to your server:
{"command": "subscribe", "identifier": "{\"channel\":\"MyChannel\"}"}
In return you'll get your first frame.
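Note that the `identifier` value is itself a JSON-encoded string rather than a nested object, which is easy to get wrong when building the frame by hand. A minimal sketch of generating it, where `subscribe_frame` is a hypothetical helper name and not part of Action Cable:

```ruby
require 'json'

# Builds the subscribe frame for a channel. The identifier is encoded
# to JSON first and then embedded as a string in the outer message.
def subscribe_frame(channel_name)
  identifier = JSON.generate(channel: channel_name)
  JSON.generate(command: 'subscribe', identifier: identifier)
end

puts subscribe_frame('MyChannel')
# prints {"command":"subscribe","identifier":"{\"channel\":\"MyChannel\"}"}
```

A non-Rails client (iOS, a standalone JS app) would send this string over the open socket right after connecting.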

Rails and slow third-party APIs

I'm building a Rails app that makes heavy use of third-party APIs. These are not like common web APIs; they wrap Linux system tools, so requests to them take a rather long time (1-5 s).
Example:
I have a Document model with controller actions like:
def index
  @documents = current_user.documents # just a simple DB request
end

def create
  @document = Document.new(document_params)
  @document.sid = call_my_slow_api(@document.title)
  @document.save
end
Let's say Alice starts a create request and is waiting for the reply. At the same time, Bob starts an index request. If I have only 1 worker, that is going to be a problem (Bob will see the index only after Alice gets her reply).
What is the best way to separate the API call (call_my_slow_api) logic in Rails?
Thanks.
Background jobs might be the way to go. If you are on Rails 4.2 or later, Active Job is there as a common DSL to tie in with any worker service. If not, Resque, Delayed Job, etc. are the ones you may wanna explore.
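To make the idea concrete without depending on any job library, here is a plain-Ruby sketch that uses a thread and a queue as a stand-in for Active Job or Resque. Everything in it is an assumption for illustration: `call_my_slow_api` is faked with a short sleep, and `JOBS`, `RESULTS`, `WORKER` and `enqueue_document` are hypothetical names.

```ruby
# Stand-in for the slow third-party call (hypothetical; sleeps briefly).
def call_my_slow_api(title)
  sleep 0.05
  "sid-#{title.downcase}"
end

JOBS    = Queue.new # pending work
RESULTS = Queue.new # finished work, kept only for demonstration

# One background worker drains the queue so request handling never
# blocks on the slow call. Pushing nil stops the worker.
WORKER = Thread.new do
  while (title = JOBS.pop)
    RESULTS << [title, call_my_slow_api(title)]
  end
end

# What `create` would do instead of calling the API inline:
# enqueue the work and return immediately.
def enqueue_document(title)
  JOBS << title
  :queued
end
```

With this shape, Alice's `create` returns as soon as the title is queued, so Bob's `index` is not held up. A real job backend adds persistence and retries, which this sketch does not attempt.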

Sending data from an analytics engine to a Rails server

I have an analytics engine which periodically packages a bunch of stats in JSON format. I want to send these packages to a Rails server. Upon a package arriving, the Rails server should examine it, generate a model instance out of it (for historical purposes), and then display the contents to the user. I've thought of two approaches.
1) Have a little app residing on the same host as the Rails server to be listening for these packages (using ZeroMQ). Upon receiving a package, the app would invoke a Rails action through CURL, passing on the package as a parameter. My concern with this approach is that my Rails server checks that only signed-in users can access actions which affect models. By creating an action accessible to this listening app (and therefore other entities), am I exposing myself to a major security flaw?
2) The second approach is to simply have the listening app dump the package into a special database table. The Rails server will then periodically check this table for new packages. Upon detecting one or more, it will process them and remove them from the table.
This is the first time I'm doing something like this, so if you have techniques or experiences you can share for better solutions, I'd love to learn.
Thank you.
You can restrict access to a certain call by limiting the IP address that is allowed for the request in routes.rb:
post "/analytics" => "analytics#create", :constraints => {:ip => /127\.0\.0\.1/}
If you want the users to see updates, you can use polling to refresh the page every minute or so.
1) Yes, you are opening a major security hole unless one of the following holds:
Your ZeroMQ app provides the data needed to do authentication and authorization on the Rails side
Your Rails app is configured to listen only on the 127.0.0.1 interface and is thus not accessible from the outside
Like Benjamin suggests, you restrict specific routes to certain IPs
2) This approach looks a lot like what delayed_job does. You might wanna take a look there : https://github.com/collectiveidea/delayed_job and use a rake task to add a new job.
In short, your listening app will call a rake task that will add a custom delayed_job when receiving a packet. Then let delayed_job handle the load. You benefit from delayed_job goodness (different queues, scaling, ...). The hard part is getting the result.
One idea would be to associate a unique ID with each job, and have the delayed_job task write the result to a data store which associates the job ID with the result. This data store can be a simple relational table
+----+--------+
| ID | Result |
+----+--------+
or a memcached/Redis/whatever instance. You just need to poll that data store looking for the result associated with the job ID, and delete everything when you are done displaying it to the user.
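The job-ID-to-result store described above can be sketched in a few lines of plain Ruby. This is an in-process stand-in only: `ResultStore`, `write` and `take` are hypothetical names, and a real setup would back this with the table or Redis/memcached instance mentioned in the answer.

```ruby
require 'securerandom'

# Hypothetical in-process result store keyed by job ID.
class ResultStore
  def initialize
    @results = {}
    @lock = Mutex.new
  end

  # Called by the job when it finishes.
  def write(job_id, result)
    @lock.synchronize { @results[job_id] = result }
  end

  # Called by the poller: returns the result and deletes it,
  # or returns nil if the job has not finished yet.
  def take(job_id)
    @lock.synchronize { @results.delete(job_id) }
  end
end

store  = ResultStore.new
job_id = SecureRandom.uuid              # unique ID handed out at enqueue time
store.take(job_id)                      # nil, the job is still running
store.write(job_id, { 'visits' => 42 }) # the job finishes
store.take(job_id)                      # the stored hash; it is now removed
```

Deleting on read keeps the store from growing, at the cost that a result can be displayed only once.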
3) Why don't you directly POST the data to the rails server ?
Following Benjamin's lead, I implemented a filter for this particular action.
def verify_ip
  @ips = ['127.0.0.1']
  unless @ips.include?(request.remote_ip)
    redirect_to root_url
  end
end
The listening app on the localhost now invokes the action, passing the JSON package received from the analytics engine as a param. Thank you.
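A string comparison against '127.0.0.1' misses the IPv6 loopback address and cannot express whole networks. As a hedged alternative, the check could be done with Ruby's standard `IPAddr` class; `ALLOWED_NETWORKS` and `allowed_ip?` are hypothetical names, and the list contents are an assumption.

```ruby
require 'ipaddr'

# Hypothetical allow-list: IPv4 and IPv6 loopback. A CIDR range such as
# IPAddr.new('10.0.0.0/8') could be added for a remote analytics host.
ALLOWED_NETWORKS = [IPAddr.new('127.0.0.1'), IPAddr.new('::1')].freeze

# Returns true when remote_ip falls inside one of the allowed networks.
# Strings that are not valid IP addresses are rejected.
def allowed_ip?(remote_ip)
  addr = IPAddr.new(remote_ip)
  ALLOWED_NETWORKS.any? { |net| net.include?(addr) }
rescue IPAddr::InvalidAddressError
  false
end
```

Inside the filter, `allowed_ip?(request.remote_ip)` would replace the `include?` test.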

Should tweets be done in the background?

On a site that makes heavy use of Twitter, where the app sends tweets via the users' OAuth credentials, should the tweets be sent in the background via a background worker (Resque, Delayed Job, etc.)? Or should the web process handle it?
It really depends on your use case. Twitter itself, I think, sends an AJAX request to the API. You could do the same if it makes sense in your interface, but it does mean that you're using a web process to do this. One of the benefits is that you can verify that the request was successful before returning a response to the user. This is much easier than a scenario where you queue something in the background, it fails, and you want to alert the user (e.g. through a "real-time" AJAX/socket-based message system or a flash notice on another request).
If you aren't worried about showing the Tweets (e.g. your application is sending as part of a larger action), then doing it in the background is definitely the way to go.
Resque is great and jobs are really lightweight, so you could add a quick integration to process these in the background pretty quickly.
# app/jobs/send_tweet.rb
class SendTweet
  @queue = :tweets

  def self.perform(user_id, content)
    user = User.find(user_id)
    # send Tweet
  end
end

# app/controllers/tweet_controller.rb
def create
  # assuming some things here, like validation and a `current_user` method
  Resque.enqueue(SendTweet, current_user.id, params[:tweet][:message])
  redirect_to :index
end

How do I see the whole HTTP request in Rails

I have a Rails application but after some time of development/debugging I realized that it would be very helpful to be able to see the whole HTTP request in the logfiles - log/development.log, not just the parameters.
I also want to have a separate logfile based on user, not session.
Any ideas will be appreciated!
You can rapidly see the request.env in a view via:
VIEW: <%= request.env.inspect %>
If instead you want to log it in development log, from your controller:
CONTROLLER: Rails.logger.info(request.env)
Here you can see a reference for the Request object.
Rails automatically sets up logging to a file in the log/ directory using Logger from the Ruby Standard Library. The logfile will be named corresponding to your environment, e.g. log/development.log.
To log a message from either a controller or a model, access the Rails logger instance with the logger method:
class YourController < ActionController::Base
  def index
    logger.info request.env
  end
end
About the user: what are you using to authenticate them?
That logger.info request.env code works fine in a Rails controller, but to see a more original version of that, or if you're using Grape or some other mounted app, you have to intercept the request on its way through the rack middleware chain...
Put this code in your lib directory (or at the bottom of application.rb):
require 'pp'

class Loggo
  def initialize(app)
    @app = app
  end

  def call(env)
    pp env
    @app.call(env)
  end
end
Then, with the other configs in application.rb:
config.middleware.use "Loggo"
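Dumping the whole env is noisy, since it also contains Rack and server internals. A variation that logs only the request line and the HTTP_* headers might look like the sketch below. `HeaderLogger` is a hypothetical name, the logger is injected as an assumption, and the usage runs against a bare lambda instead of a real Rails app.

```ruby
require 'logger'
require 'stringio'

# Logs the request method, path and HTTP_* headers, then passes the
# request on to the next app in the middleware chain unchanged.
class HeaderLogger
  def initialize(app, logger)
    @app = app
    @logger = logger
  end

  def call(env)
    headers = env.select { |key, _| key.start_with?('HTTP_') }
    @logger.info("#{env['REQUEST_METHOD']} #{env['PATH_INFO']} #{headers}")
    @app.call(env)
  end
end

# Usage with a trivial Rack-style app and an in-memory log.
log   = StringIO.new
app   = ->(_env) { [200, { 'Content-Type' => 'text/plain' }, ['ok']] }
stack = HeaderLogger.new(app, Logger.new(log))
stack.call('REQUEST_METHOD' => 'GET', 'PATH_INFO' => '/reports', 'HTTP_HOST' => 'example.com')
puts log.string
```

Passing a different `Logger` per user (for example one opened on a per-user file) would be one way to approach the "separate logfile based on user" part of the question, though that requires knowing the user at the middleware level.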
You can use rack middleware to log the requests as the middleware sees them (as parsed by the http-server and as transformed by preceding middleware). You can also configure your http-server to log full requests, if your http-server supports such a feature.
The http-server (web-server) is responsible for receiving http requests, parsing them, and transmitting data structures to the application-server (e.g., a rack application). The application-server does not see the original request, but sees what the http-server sends its way.
I initially used the code snippet by @AlexChaffee, but I've since switched to using mitmproxy, a specialized HTTP proxy that records the requests and responses passing through it.
This is obviously only helpful for development scenarios when you control the applications making the requests. You might be able to achieve similar results with a reverse proxy for production applications (the advantage being that you don't have to touch the Rails application itself for this), but I haven't looked into this.

Resources