Rack middleware and thread-safety - ruby-on-rails

I have a custom Rack middleware used by my Rails 4 application. The middleware is only there to default the Accept and Content-Type headers to application/json if the client did not provide valid values (I'm working on an API). So before each request it changes those headers, and after each request it adds a custom X-Something-Media-Type header with custom media type information.
I would like to switch to Puma, so I'm a bit worried about the thread-safety of such a middleware. I did not play with instance variables, except once for the common @app.call that we encounter in every middleware, but even there I reproduced something I've read in RailsCasts' comments:
def initialize(app)
  @app = app
end
def call(env)
  dup._call(env)
end
def _call(env)
  ...
  status, headers, response = @app.call(env)
  ...
Is the dup._call really useful for handling thread-safety problems?
Apart from that @app instance variable, I only work with the current request, built from the current env variable:
request = Rack::Request.new(env)
And I call env.update to update the headers and form information.
Is it dangerous enough to expect issues with that middleware when I switch from WEBrick to a concurrent web server such as Puma?
If so, do you know a handy way to test and isolate the portions of my middleware that are not thread-safe?
Thanks.

Yes, it's necessary to dup the middleware to be thread-safe. That way, any instance variables you set from _call will be set on the duped instance, not the original. You'll notice that web frameworks that are built around Rack work this way:
Pakyow
Sinatra
One way to unit test this is to assert that _call is called on a duped instance rather than the original.
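A minimal sketch of such a check, in plain Ruby without a test framework. The class name MediaTypeMiddleware, the seen_path reader and the X-Handled-By header are made up for illustration; only the dup._call pattern comes from the question.

```ruby
# Hypothetical middleware using the dup._call pattern from the question.
class MediaTypeMiddleware
  attr_reader :seen_path

  def initialize(app)
    @app = app
  end

  def call(env)
    dup._call(env) # every request works on its own shallow copy
  end

  def _call(env)
    @seen_path = env['PATH_INFO'] # per-request state, set on the copy only
    status, headers, body = @app.call(env)
    # Expose which instance handled the request (illustration only).
    [status, headers.merge('X-Handled-By' => object_id.to_s), body]
  end
end

inner_app  = ->(_env) { [200, {}, ['ok']] }
middleware = MediaTypeMiddleware.new(inner_app)
_status, headers, _body = middleware.call('PATH_INFO' => '/ping')

# The original instance was never mutated, and a different object did the work.
puts middleware.seen_path.inspect
puts headers['X-Handled-By'] != middleware.object_id.to_s
```

The same two checks translate directly into RSpec or Minitest assertions. Note that dup is a shallow copy, so objects referenced by instance variables (such as @app) are still shared between requests.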

Related

Ruby library with least footprint to host a very simple single endpoint API

I have a very simple number crunching Ruby function that I want to make available via a web API. The API is essentially a single endpoint, e.g. http://example.com/crunch/<number> and it returns JSON output.
I can obviously install Rails and implement this quickly. I require no more help from a 'framework' other than to handle HTTP for me. No ORM, MVC and other frills.
On the far end, I can write some Ruby code to listen on a port and accept GET request and parse HTTP headers etc. etc. I don't want to re-invent that wheel either.
What can I use to expose a minimal API to the web using something with the least footprint/dependencies. I read about Sinatra, Ramaze, etc., but I believe there can be a way to do something even simpler. Can I just hack some code on top of Rack to do what I am trying to do?
Or in other words, what will be the simplest Ruby equivalent of the following code in nodejs:
var http = require('http');
http.createServer(function (req, res) {
res.writeHead(200, {'Content-Type': 'text/plain'});
var ans = crunch(number);
res.end(ans);
}).listen(1337, "127.0.0.1");
console.log('Server running at http://127.0.0.1:1337/');
You seem like you want to use Rack directly. "Rack from the Beginning" is a decent tutorial that should get you started.
It'll probably look something like this:
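require 'rack'
class CrunchApp
  def self.crunch(crunchable)
    # top-secret crunching
  end
  def self.call(env)
    crunchy_stuff = input(env)
    # The body must respond to #each, so wrap the result in an array.
    [200, {}, [crunch(crunchy_stuff).to_s]]
  end
  def self.input(env)
    request = Rack::Request.new(env)
    request.params['my_input']
  end
  # A bare `private` does not apply to `def self.` methods.
  private_class_method :input
end
# Rack::Server ships with Rack 2; in Rack 3 it moved to the rackup gem.
Rack::Server.start app: CrunchApp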
But I must say, using that instead of something like Sinatra seems silly unless this is just a fun project to play with things. See their 'Hello World':
require 'sinatra'
get '/hi' do
  "Hello World!"
end
Ruby-Grape is a good option for your use case. It has a minimal implementation over Rack that allows the creation of simple REST API endpoints.
Cuba is another good option, with a thin layer over Rack itself (sample post).
If you are familiar with Rails you can use the Rails API gem, which is very well documented and has minor overhead. Remember also that Rails-API will be part of Rails 5.
Last but not least, you can implement it on Rack directly.

HTTP request uuid or request start time in Ruby applications

I need to get a request UUID or the time at which the server received a request. It's easy in Rails, but I'm working on a gem and I would like it to be more generic, so it should also work with Sinatra and every other Ruby application that runs in an HTTP server.
The other problem is that it's a gem: I can't put Time.now at the beginning of my application controller. I need it to be generic, so it should work with different frameworks.
What would you propose?
You can implement a Rack middleware, which you can use independently of your actual application framework (as long as it uses Rack, which is true for at least Rails, Sinatra, Padrino and most other Ruby web frameworks).
Rails already includes a middleware for adding a unique ID to a request, if required, in ActionDispatch::RequestId. Another alternative could be the rack-request-id gem.
A minimal version of this middleware could look like this:
require 'securerandom'
class RequestIdMiddleware
  def initialize(app)
    @app = app
  end
  def call(env)
    env['request_id'] = env['HTTP_X_REQUEST_ID'] || SecureRandom.uuid
    env['request_started_at'] = Time.now
    @app.call(env)
  end
end
You can then use this middleware in your config.ru or by adding this to your application.rb in Rails:
config.middleware.use RequestIdMiddleware
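To see how code further down the stack would pick these values up, here is a small sketch that wires the middleware to a stand-in app by hand. The echo_app lambda is hypothetical, and the middleware is repeated so the snippet runs on its own without a web server.

```ruby
require 'securerandom'

# Same middleware as above, repeated so this sketch is self-contained.
class RequestIdMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    env['request_id'] = env['HTTP_X_REQUEST_ID'] || SecureRandom.uuid
    env['request_started_at'] = Time.now
    @app.call(env)
  end
end

# Hypothetical downstream app: anything later in the stack can read the values.
echo_app = lambda do |env|
  elapsed = Time.now - env['request_started_at']
  [200, { 'Content-Type' => 'text/plain' }, ["#{env['request_id']} (#{elapsed}s)"]]
end

stack = RequestIdMiddleware.new(echo_app)
_, _, generated = stack.call({})                               # no header: a UUID is generated
_, _, forwarded = stack.call('HTTP_X_REQUEST_ID' => 'abc-123') # header present: it is reused
puts generated.first
puts forwarded.first
```

Because the values live in env, which is created per request, this middleware keeps no per-request state on itself.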

Change log level for single controller or action in rails

We are running a rails project behind haproxy. There is a keep-alive sent to the application every second. This is causing very noisy log files which makes it a bit of a pain to dig through and is making them unnecessarily large.
My first thought was to change the logging level for that action to debug, but someone else proposed changing the logging level in an around_filter. I am not crazy about that idea, but it could just be how I implemented it. I am open to different solutions, but the general requirements are that I can quiet those actions, but I could change the logging level if I needed to see them for whatever reason.
Another solution is to insert some Rack middleware which handles the keep-alive check before it gets to the Rails ApplicationController lifecycle.
Step 1: make some middleware which responds to the keep-alive check. In my example the keep-alive request is a GET /health-check, so it would look like:
class HealthCheckMiddleware
  def initialize(app)
    @app = app
  end
  def call(env)
    if env['PATH_INFO'] == '/health-check'
      return [200, {}, ['healthy']]
    end
    @app.call(env)
  end
end
Of course, adjust this health check as necessary. Maybe you need to check other Request / CGI variables...
Step 2: make sure you insert this middleware before Rails::Rack::Logger:
config.middleware.insert_before Rails::Rack::Logger, "HealthCheckMiddleware"
Now your middleware will handle the health check and your logs have been totally by-passed.
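If matching on the path alone is too broad, the check can look at other env keys as well. The sketch below assumes that only GET and HEAD requests to /health-check should be short-circuited; the class name StrictHealthCheck and the stand-in rails_app lambda are hypothetical.

```ruby
class StrictHealthCheck
  HEALTH_PATH = '/health-check'.freeze

  def initialize(app)
    @app = app
  end

  def call(env)
    if health_check?(env)
      [200, { 'Content-Type' => 'text/plain' }, ['healthy']]
    else
      @app.call(env)
    end
  end

  private

  # Only plain probes are answered here; everything else reaches the app.
  def health_check?(env)
    env['PATH_INFO'] == HEALTH_PATH &&
      %w[GET HEAD].include?(env['REQUEST_METHOD'])
  end
end

rails_app = ->(_env) { [200, {}, ['from the app']] } # stand-in for the Rails app
stack = StrictHealthCheck.new(rails_app)

probe  = stack.call('PATH_INFO' => '/health-check', 'REQUEST_METHOD' => 'GET')
posted = stack.call('PATH_INFO' => '/health-check', 'REQUEST_METHOD' => 'POST')
puts probe.last.first
puts posted.last.first
```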

What is Rack middleware?

What is Rack middleware in Ruby? I couldn't find any good explanation for what they mean by "middleware".
Rack as Design
Rack middleware is more than "a way to filter a request and response" - it's an implementation of the pipeline design pattern for web servers using Rack.
It very cleanly separates out the different stages of processing a request - separation of concerns being a key goal of all well designed software products.
For example with Rack I can have separate stages of the pipeline doing:
Authentication: when the request arrives, are the user's logon details correct? How do I validate this: OAuth, HTTP Basic Authentication, name/password?
Authorization: "is the user authorised to perform this particular task?", i.e. role-based security.
Caching: have I processed this request already, can I return a cached result?
Decoration: how can I enhance the request to make downstream processing better?
Performance & Usage Monitoring: what stats can I get from the request and response?
Execution: actually handle the request and provide a response.
Being able to separate the different stages (and optionally include them) is a great help in developing well structured applications.
Community
There's also a great eco-system developing around Rack Middleware - you should be able to find pre-built rack components to do all of the steps above and more. See the Rack GitHub wiki for a list of middleware.
What's Middleware?
Middleware is a dreadful term which refers to any software component/library which assists with but is not directly involved in the execution of some task. Very common examples are logging, authentication and the other common, horizontal processing components. These tend to be the things that everyone needs across multiple applications but not too many people are interested (or should be) in building themselves.
More Information
The comment about it being a way to filter requests probably comes from the RailsCast episode 151: Rack Middleware screen cast.
Rack middleware evolved out of Rack and there is a great intro at Introduction to Rack middleware.
There's an intro to middleware on Wikipedia here.
First of all, Rack is exactly two things:
A webserver interface convention
A gem
Rack - The Webserver Interface
The very basis of Rack is a simple convention. Every Rack-compliant web server will always call a call method on an object you give it and serve the result of that method. Rack specifies exactly what this call method has to look like, and what it has to return. That's Rack.
Let's give it a simple try. I'll use WEBrick as rack compliant webserver, but any of them will do. Let's create a simple web application that returns a JSON string. For this we'll create a file called config.ru. The config.ru will automatically be called by the rack gem's command rackup which will simply run the contents of the config.ru in a rack-compliant webserver. So let's add the following to the config.ru file:
class JSONServer
  def call(env)
    [200, {"Content-Type" => "application/json"}, ['{ "message" : "Hello!" }']]
  end
end
map '/hello.json' do
  run JSONServer.new
end
As the convention specifies our server has a method called call that accepts an environment hash and returns an array with the form [status, headers, body] for the webserver to serve. Let's try it out by simply calling rackup. A default rack compliant server, maybe WEBrick or Mongrel will start and immediately wait for requests to serve.
$ rackup
[2012-02-19 22:39:26] INFO WEBrick 1.3.1
[2012-02-19 22:39:26] INFO ruby 1.9.3 (2012-01-17) [x86_64-darwin11.2.0]
[2012-02-19 22:39:26] INFO WEBrick::HTTPServer#start: pid=16121 port=9292
Let's test our new JSON server by either curling or visiting the url http://localhost:9292/hello.json and voila:
$ curl http://localhost:9292/hello.json
{ "message" : "Hello!" }
It works. Great! That's the basis for every web framework, be it Rails or Sinatra. At some point they implement a call method, work through all the framework code, and finally return a response in the typical [status, headers, body] form.
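Because the convention is nothing more than "respond to call", the server's side of the contract can be imitated in plain Ruby. In this sketch the hand-built fake_env hash and the lambda_app are stand-ins for illustration; a real server supplies many more env keys.

```ruby
class JSONServer
  def call(env)
    [200, { 'Content-Type' => 'application/json' }, ['{ "message" : "Hello!" }']]
  end
end

# A lambda responds to #call too, so it is an equally valid Rack app.
lambda_app = lambda do |env|
  [200, { 'Content-Type' => 'text/plain' }, ["You asked for #{env['PATH_INFO']}"]]
end

fake_env = { 'REQUEST_METHOD' => 'GET', 'PATH_INFO' => '/hello.json' }

[JSONServer.new, lambda_app].each do |app|
  status, headers, body = app.call(fake_env)
  text = +''
  body.each { |chunk| text << chunk } # servers read the body via #each
  puts "#{status} #{headers['Content-Type']} #{text}"
end
```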
In Ruby on Rails, for example, the Rack request hits the ActionDispatch::Routing::Mapper class, which looks like this:
module ActionDispatch
  module Routing
    class Mapper
      ...
      def initialize(app, constraints, request)
        @app, @constraints, @request = app, constraints, request
      end
      def matches?(env)
        req = @request.new(env)
        ...
        return true
      end
      def call(env)
        matches?(env) ? @app.call(env) : [ 404, {'X-Cascade' => 'pass'}, [] ]
      end
      ...
    end
  end
end
So basically Rails checks, depending on the env hash, whether any route matches. If so, it passes the env hash on to the application to compute the response; otherwise it immediately responds with a 404. So any web server that is compliant with the Rack interface convention is able to serve a full-blown Rails application.
Middleware
Rack also supports the creation of middleware layers. They basically intercept a request, do something with it and pass it on. This is very useful for versatile tasks.
Let's say we want to add logging to our JSON server that also measures how long a request takes. We can simply create a middleware logger that does exactly this:
class RackLogger
  def initialize(app)
    @app = app
  end
  def call(env)
    @start = Time.now
    @status, @headers, @body = @app.call(env)
    @duration = ((Time.now - @start).to_f * 1000).round(2)
    puts "#{env['REQUEST_METHOD']} #{env['REQUEST_PATH']} - Took: #{@duration} ms"
    [@status, @headers, @body]
  end
end
When it gets created, it stores a reference to the actual Rack application. In our case that's an instance of our JSONServer. Rack automatically calls the call method on the middleware and expects back a [status, headers, body] array, just like our JSONServer returns.
So in this middleware, the start time is taken, then the actual call to the JSONServer is made with @app.call(env), then the logger outputs the logging entry and finally returns the response as [@status, @headers, @body].
To make our little config.ru use this middleware, add a use RackLogger to it like this:
class JSONServer
  def call(env)
    [200, {"Content-Type" => "application/json"}, ['{ "message" : "Hello!" }']]
  end
end
class RackLogger
  def initialize(app)
    @app = app
  end
  def call(env)
    @start = Time.now
    @status, @headers, @body = @app.call(env)
    @duration = ((Time.now - @start).to_f * 1000).round(2)
    puts "#{env['REQUEST_METHOD']} #{env['REQUEST_PATH']} - Took: #{@duration} ms"
    [@status, @headers, @body]
  end
end
use RackLogger
map '/hello.json' do
  run JSONServer.new
end
Restart the server and voila, it outputs a log on every request. Rack allows you to add multiple middlewares that are called in the order they are added. It's just a great way to add functionality without changing the core of the rack application.
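One caveat with RackLogger as written: it keeps per-request values in instance variables on a single shared middleware object, so under a threaded server concurrent requests can overwrite each other's values. A sketch of a variant that keeps everything in local variables follows. The class name TimingLogger and the injectable out argument are assumptions for illustration, and PATH_INFO is used because it is the path key the Rack spec defines.

```ruby
require 'stringio'

class TimingLogger
  def initialize(app, out = $stdout)
    @app = app # set once at boot, only read afterwards
    @out = out
  end

  def call(env)
    # Locals are private to each call, so concurrent requests cannot clash.
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    status, headers, body = @app.call(env)
    finished = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    duration_ms = ((finished - started) * 1000).round(2)
    @out.puts "#{env['REQUEST_METHOD']} #{env['PATH_INFO']} - Took: #{duration_ms} ms"
    [status, headers, body]
  end
end

log = StringIO.new
app = TimingLogger.new(->(_env) { [200, {}, ['ok']] }, log)
response = app.call('REQUEST_METHOD' => 'GET', 'PATH_INFO' => '/hello.json')
puts log.string
```

In a config.ru, use TimingLogger would work unchanged, since the second constructor argument defaults to $stdout.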
Rack - The Gem
Although rack - first of all - is a convention it also is a gem that provides great functionality. One of them we already used for our JSON server, the rackup command. But there's more! The rack gem provides little applications for lots of use cases, like serving static files or even whole directories. Let's see how we serve a simple file, for example a very basic HTML file located at htmls/index.html:
<!DOCTYPE HTML>
<html>
<head>
<title>The Index</title>
</head>
<body>
<p>Index Page</p>
</body>
</html>
We maybe want to serve this file from the website root, so let's add the following to our config.ru:
map '/' do
  run Rack::File.new "htmls/index.html"
end
If we visit http://localhost:9292 we see our HTML file perfectly rendered. That was easy, right?
Let's add a whole directory of javascript files by creating some javascript files under /javascripts and adding the following to the config.ru:
map '/javascripts' do
  run Rack::Directory.new "javascripts"
end
Restart the server and visit http://localhost:9292/javascripts and you'll see a list of all the JavaScript files, which you can now include from anywhere.
I had a problem understanding Rack myself for a good amount of time. I only fully understood it after working on making this miniature Ruby web server myself. I've shared my learnings about Rack (in the form of a story) here on my blog: http://blog.gauravchande.com/what-is-rack-in-ruby-rails
Feedback is more than welcome.
What is Rack?
Rack provides a minimal interface between webservers supporting Ruby and Ruby frameworks.
Using Rack you can write a Rack Application.
Rack will pass the Environment hash (a Hash, contained inside a HTTP request from a client, consisting of CGI-like headers) to your Rack Application which can use things contained in this hash to do whatever it wants.
What is a Rack Application?
To use Rack, you must provide an 'app' - an object that responds to the #call method with the Environment Hash as a parameter (typically defined as env). #call must return an Array of exactly three values:
the Status Code (eg '200'),
a Hash of Headers,
the Response Body (which must respond to the Ruby method, each).
You can write a Rack Application that returns such an array - this will be sent back to your client, via the web server, as the HTTP Response (Rack also provides a helper class, Rack::Response, for building it [click to go to docs]).
A Very Simple Rack Application:
gem install rack
Create a config.ru file - Rack knows to look for this.
We will create a tiny Rack Application that returns a Response whose Response Body is an array that contains a String: "Hello, World!".
We will fire up a local server using the command rackup.
When visiting the relevant port in our browser we will see "Hello, World!" rendered in the viewport.
#./message_app.rb
class MessageApp
  def call(env)
    [200, {}, ['Hello, World!']]
  end
end
#./config.ru
require_relative './message_app'
run MessageApp.new
Fire up a local server with rackup and visit localhost:9292 and you should see 'Hello, World!' rendered.
This is not a comprehensive explanation, but essentially what happens here is that the Client (the browser) sends an HTTP Request to your local server, which hands it via Rack to the MessageApp instance created in config.ru by running call, passing in the Environment Hash as a parameter (the env argument).
The server takes the return value (the array), turns it into an HTTP Response and sends that back to the Client. The browser then renders 'Hello, World!' on the screen.
Incidentally, if you want to see what the environment hash looks like, just put puts env underneath def call(env).
Minimal as it is, what you have written here is a Rack application!
Making a Rack Application interact with the Incoming Environment hash
In our little Rack app, we can interact with the env hash (see here for more about the Environment hash).
We will implement the ability for the user to input their own query string into the URL, hence, that string will be present in the HTTP request, encapsulated as a value in one of the key/value pairs of the Environment hash.
Our Rack app will access that query string from the Environment hash and send that back to the client (our browser, in this case) via the Body in the Response.
From the Rack docs on the Environment Hash:
"QUERY_STRING: The portion of the request URL that follows the ?, if any. May be empty, but is always required!"
#./message_app.rb
class MessageApp
  def call(env)
    message = env['QUERY_STRING']
    [200, {}, [message]]
  end
end
Now, rackup and visit localhost:9292?hello (?hello being the query string) and you should see 'hello' rendered in the viewport.
Rack Middleware
We will:
insert a piece of Rack Middleware into our codebase - a class: MessageSetter,
the Environment hash will hit this class first and will be passed in as a parameter: env,
MessageSetter will insert a 'MESSAGE' key into the env hash, its value being 'Hello, World!' if env['QUERY_STRING'] is empty; env['QUERY_STRING'] if not,
finally, it will return @app.call(env) - @app being the next app in the 'Stack': MessageApp.
First, the 'long-hand' version:
#./middleware/message_setter.rb
class MessageSetter
  def initialize(app)
    @app = app
  end
  def call(env)
    if env['QUERY_STRING'].empty?
      env['MESSAGE'] = 'Hello, World!'
    else
      env['MESSAGE'] = env['QUERY_STRING']
    end
    @app.call(env)
  end
end
#./message_app.rb (now reads the MESSAGE key set by the middleware)
class MessageApp
  def call(env)
    message = env['MESSAGE']
    [200, {}, [message]]
  end
end
#config.ru
require_relative './message_app'
require_relative './middleware/message_setter'
app = Rack::Builder.new do
  use MessageSetter
  run MessageApp.new
end
run app
From the Rack::Builder docs we see that Rack::Builder implements a small DSL to iteratively construct Rack applications. This basically means that you can build a 'Stack' consisting of one or more Middlewares and a 'bottom level' application to dispatch to. All requests going through to your bottom-level application will be first processed by your Middleware(s).
#use specifies middleware to use in a stack. It takes the middleware as an argument.
Rack Middleware must:
have a constructor that takes the next application in the stack as a parameter.
respond to the call method that takes the Environment hash as a parameter.
In our case, the 'Middleware' is MessageSetter, the 'constructor' is MessageSetter's initialize method, the 'next application' in the stack is MessageApp.
So here, because of what Rack::Builder does under the hood, the app argument of MessageSetter's initialize method is MessageApp.
(get your head around the above before moving on)
Therefore, each piece of Middleware essentially 'passes down' the existing Environment hash to the next application in the chain - so you have the opportunity to mutate that environment hash within the Middleware before passing it on to the next application in the stack.
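Roughly speaking, the stack that Rack::Builder assembles from use and run can also be wired up by hand, which makes the "passing down" visible. The sketch below is plain Ruby and needs no Rack gem; MessageApp here reads env['MESSAGE'] so that the middleware's effect can be observed.

```ruby
class MessageSetter
  def initialize(app)
    @app = app
  end

  def call(env)
    # Mutate the env hash, then hand it to the next app in the stack.
    env['MESSAGE'] = env['QUERY_STRING'].empty? ? 'Hello, World!' : env['QUERY_STRING']
    @app.call(env)
  end
end

class MessageApp
  def call(env)
    [200, {}, [env['MESSAGE']]]
  end
end

# Approximately what `use MessageSetter` followed by `run MessageApp.new` builds:
stack = MessageSetter.new(MessageApp.new)

puts stack.call('QUERY_STRING' => '').last.first
puts stack.call('QUERY_STRING' => 'hello').last.first
```

With more middlewares the nesting simply continues, for example Outer.new(Inner.new(app)), where Outer and Inner are placeholder names: the first middleware added with use ends up outermost.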
#run takes an argument that is an object that responds to #call and returns a Rack response (the [status, headers, body] array).
Conclusions
Using Rack::Builder you can construct chains of Middlewares and any request to your application will be processed by each Middleware in turn before finally being processed by the final piece in the stack (in our case, MessageApp). This is extremely useful because it separates-out different stages of processing requests. In terms of 'separation of concerns', it couldn't be much cleaner!
You can construct a 'request pipeline' consisting of several Middlewares that deal with things such as:
Authentication
Authorisation
Caching
Decoration
Performance & Usage Monitoring
Execution (actually handle the request and provide a response)
(above bullet points from another answer on this thread)
You will often see this in professional Sinatra applications. Sinatra uses Rack! See here for the definition of what Sinatra IS!
As a final note, our config.ru can be written in a short-hand style, producing exactly the same functionality (and this is what you'll typically see):
require_relative './message_app'
require_relative './middleware/message_setter'
use MessageSetter
run MessageApp.new
And to show more explicitly what MessageApp is doing, here is a 'long-hand' version in which #call builds the response with Rack::Response and its three arguments (body, status, headers); calling finish converts it into the [status, headers, body] array that Rack expects.
class MessageApp
  def call(env)
    Rack::Response.new([env['MESSAGE']], 200, {}).finish
  end
end
Useful links
Complete code for this post (Github repo commit)
Good Blog Post, "Introduction to Rack Middleware"
Some good Rack documentation
Rack is a gem which provides a simple interface to abstract HTTP request/response. Rack sits between web frameworks (Rails, Sinatra etc) and web servers (unicorn, puma) as an adaptor. As the image above shows, this keeps the unicorn server completely independent of Rails, and Rails doesn't know about unicorn. This is a good example of loose coupling and separation of concerns.
Above image is from this rails conference talk on rack https://youtu.be/3PnUV9QzB0g I recommend watching it for deeper understanding.
config.ru minimal runnable example
app = Proc.new do |env|
  [
    200,
    {
      'Content-Type' => 'text/plain'
    },
    ["main\n"]
  ]
end
class Middleware
  def initialize(app)
    @app = app
  end
  def call(env)
    @status, @headers, @body = @app.call(env)
    [@status, @headers, @body << "Middleware\n"]
  end
end
use(Middleware)
run(app)
Run rackup and visit localhost:9292. The output is:
main
Middleware
So it is clear that the Middleware wraps and calls the main app. Therefore it is able to pre-process the request, and post-process the response in any way.
As explained at: http://guides.rubyonrails.org/rails_on_rack.html#action-dispatcher-middleware-stack , Rails uses Rack middlewares for a lot of its functionality, and you can add your own too with the config.middleware.use family of methods.
The advantage of implementing functionality in a middleware is that you can reuse it on any Rack framework, thus all major Ruby ones, and not just Rails.
Rack middleware is a way to filter a request and response coming into your application. A middleware component sits between the client and the server, processing inbound requests and outbound responses, but it's more than an interface for talking to the web server. It's used to group and order modules, which are usually Ruby classes, and to specify dependencies between them. A Rack middleware module must only: have a constructor that takes the next application in the stack as a parameter; and respond to the "call" method, which takes the environment hash as a parameter. The return value of this call is an array of: status code, headers hash and response body.
I've used Rack middleware to solve a couple problems:
Catching JSON parse errors with custom Rack middleware and returning nicely formatted error messages when client submits busted JSON
Content Compression via Rack::Deflater
It afforded pretty elegant fixes in both cases.
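For the first case, a middleware along these lines might work. It is a sketch rather than the answerer's actual code: the class name JsonErrorCatcher and the parsing_app lambda are hypothetical, and it assumes the downstream app lets JSON::ParserError propagate instead of rescuing it.

```ruby
require 'json'
require 'stringio'

class JsonErrorCatcher
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  rescue JSON::ParserError => e
    # Turn the exception into a well-formed 400 response.
    payload = JSON.generate(error: 'Invalid JSON in request body', detail: e.message)
    [400, { 'Content-Type' => 'application/json' }, [payload]]
  end
end

# Hypothetical downstream app that parses the request body as JSON.
parsing_app = lambda do |env|
  data = JSON.parse(env['rack.input'].read)
  [200, { 'Content-Type' => 'application/json' }, [JSON.generate(received: data)]]
end

stack = JsonErrorCatcher.new(parsing_app)
good = stack.call('rack.input' => StringIO.new('{"a": 1}'))
bad  = stack.call('rack.input' => StringIO.new('{"a": '))
puts good.first
puts bad.first
```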
Rack - The Interface b/w Web & App Server
Rack is a Ruby package which provides an interface for a web server to communicate with the application. It is easy to add middleware components between the web server and the app to modify the way your request/response behaves. The middleware component sits between the client and the server, processing inbound requests and outbound responses.
In layman's terms, it is basically just a set of guidelines for how a server and a Rails app (or any other Ruby web app) should talk to each other.
To use Rack, provide an "app": an object that responds to the call method, taking the environment hash as a parameter, and returning an Array with three elements:
The HTTP response code
A Hash of headers
The response body, which must respond to each.
For more explanation, you can follow the below links.
1. https://rack.github.io/
2. https://redpanthers.co/rack-middleware/
3. https://blog.engineyard.com/2015/understanding-rack-apps-and-middleware
4. https://guides.rubyonrails.org/rails_on_rack.html#resources
In rails, we have config.ru as a rack file, you can run any rack file with rackup command. And the default port for this is 9292. To test this, you can simply run rackup in your rails directory and see the result. You can also assign port on which you want to run it. Command to run rack file on any specific port is
rackup -p PORT_NUMBER

Re-entrant subrequests in Rack/Rails

I've got a couple Engine plugins with metal endpoints that implement some extremely simple web services I intend to share across multiple applications. They work just fine as they are, but obviously, while loading them locally for development and testing, sending Net::HTTP a get_response message to ask localhost for another page from inside the currently executing controller object results in instant deadlock.
So my question is, does Rails' (or Rack's) routing system provide a way to safely consume a web service which may or may not be a part of the same app under the same server instance, or will I have to hack a special case together with render_to_string for those times when the hostname in the URI matches my own?
It doesn't work in development because the server is only serving one request at a time, so the controller's request gets stuck. If you need this you can run multiple servers locally behind a load balancer. I recommend using Passenger even for development (and the prefpane if you are on OS X).
My recommendation for you is to separate the internal web services and the applications that use them. This way you do not duplicate the code and you can easily scale and control them individually.
This is in fact possible. However, you need to ensure that the services you call are not calling each other recursively.
A really simple "reentrant" Rack middleware could work like this:
class Reentry < Struct.new(:app)
  def call(env)
    # Note: this instance variable is shared between threads.
    @current_env = env
    app.call(env.merge('reentry' => self))
  end
  def call_rack(request_uri)
    env_for_recursive_call = @current_env.dup
    env_for_recursive_call['PATH_INFO'] = request_uri # ...and more
    status, headers, response = call(env_for_recursive_call)
    # for example, return response as a String
    response.inject('') { |buf, part| buf + part }
  end
end
Then in the calling code:
env['reentry'].call_rack('/my/api/get-json')
A very valid use case for this is sideloading API responses in JSON format within your main page.
Obviously the success of this technique will depend on the sophistication of your Rack stack (as some parts of the Rack env will not like being reused).

Resources