Exclusively server-side caching of a Rails app with Rack::Cache - ruby-on-rails

I have the following problem: I want to cache the result of an action in Redis. For this reason, I use https://github.com/jodosha/redis-rack-cache. The fact that an action should be cached by Rack::Cache is determined by setting the appropriate HTTP header information in Rails, e.g.:
response.headers['Cache-Control'] = 'max-age=3600, public, must-revalidate'
Now, Rack::Cache will correctly cache the response in Redis. However, this header also tells the browser to cache the response, which I don't want! The response should be cached exclusively on the server side.
As a workaround, I am replacing the header in nginx, which I use as a reverse proxy, but there must be a more elegant way. Does anybody know how to do it?
Best regards,
Martin

One option would be to write your own middleware that sits above Rack::Cache and then removes these Cache-Control headers from the response.
Something as simple as:
def call(env)
  status, headers, body = @app.call(env)
  headers.delete("Cache-Control")
  [status, headers, body]
end
would work as a middleware.
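For completeness, here is a minimal sketch of that idea as a full middleware class. The class name StripCacheControl and the stand-in inner app are made up for illustration; the header name is matched case-insensitively because header casing differs between Rack versions.

```ruby
# Hypothetical middleware name; mount it above (outside) Rack::Cache.
class StripCacheControl
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    # Rack::Cache (below us) has already seen and stored the response,
    # so dropping the header here only changes what the client receives.
    headers = headers.reject { |name, _| name.to_s.casecmp?("Cache-Control") }
    [status, headers, body]
  end
end

# Stand-in for the rest of the stack, just to exercise the middleware.
inner_app = lambda do |_env|
  [200, { "Content-Type" => "text/plain", "Cache-Control" => "max-age=3600, public" }, ["hello"]]
end

_status, headers, _body = StripCacheControl.new(inner_app).call({})
p headers # Cache-Control is gone, Content-Type is kept
```

In a Rails app, something like config.middleware.insert_before Rack::Cache, StripCacheControl should place it correctly, assuming Rack::Cache is in your middleware stack under that name.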

Related

Rack middleware and thread-safety

I have a custom Rack middleware used by my Rails 4 application. The middleware is only there to default the Accept and Content-Type headers to application/json if the client did not provide valid values (I'm working on an API). So before each request it changes those headers, and after each request it adds a custom X-Something-Media-Type header with custom media type information.
I would like to switch to Puma, so I'm a bit worried about the thread-safety of such a middleware. I did not use instance variables, except for the usual @app.call that appears in every middleware, and even there I reproduced something I read in the RailsCasts comments:
def initialize(app)
  @app = app
end
def call(env)
  dup._call(env)
end
def _call(env)
  ...
  status, headers, response = @app.call(env)
  ...
Is dup._call really useful for handling thread-safety problems?
Apart from that @app instance variable, I only work with the current request, built from the current env variable:
request = Rack::Request.new(env)
And I call env.update to update header and form information.
Is this dangerous enough to expect issues with that middleware when I switch from WEBrick to a concurrent web server such as Puma?
If so, do you know a handy way to write tests that isolate the portions of my middleware that are not thread-safe?
Thanks.
Yes, it's necessary to dup the middleware to be thread-safe. That way, any instance variables you set from _call will be set on the duped instance, not the original. You'll notice that web frameworks that are built around Rack work this way:
Pakyow
Sinatra
One way to unit test this is to assert that _call is called on a duped instance rather than the original.
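A rough sketch of such a check, in plain Ruby without a test framework. The middleware name JsonDefaults, the @request_path variable and the X-Handled-By header are invented for the example; only the dup._call pattern comes from the question.

```ruby
# Hypothetical middleware following the dup._call pattern from the question.
class JsonDefaults
  def initialize(app)
    @app = app
  end

  def call(env)
    dup._call(env)
  end

  def _call(env)
    # Per-request state lives on the duplicate, never on the shared instance.
    @request_path = env["PATH_INFO"]
    status, headers, body = @app.call(env)
    [status, headers.merge("X-Handled-By" => object_id.to_s), body]
  end
end

inner_app = ->(_env) { [200, {}, ["ok"]] }
middleware = JsonDefaults.new(inner_app)
_status, headers, _body = middleware.call("PATH_INFO" => "/things")

# The request was handled by a copy, and the original holds no request state.
puts headers["X-Handled-By"] != middleware.object_id.to_s
puts middleware.instance_variable_defined?(:@request_path)
```

The same two checks translate directly into assertions in whatever test framework you use.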

Rails 3.1 and Http Page Caching

Given that Heroku Cedar doesn't have http caching provided by Varnish I would like to use Rack::Cache.
I have been told that Rails 3.1.1 has Rack::Cache active by default; I just need to make sure the configuration has:
config.action_controller.perform_caching = true
and I need to pick a cache store, for this experiment I'm using:
config.cache_store = :memory_store
In the action of the page I want to cache I've added the following lines:
response.header['Cache-Control'] = 'public, max-age=300'
response.header['Expires'] = CGI.rfc1123_date(Time.now + 300)
This code used to work fine with Varnish: the first request would return a 200 and subsequent requests (for 5 minutes) would return a 304.
This doesn't happen with Rails 3.1 and the Heroku Cedar stack.
I do get those headers in the response, but subsequent requests return 200 instead of 304.
What am I doing wrong? Thank you.
As you noted, the Cedar stack doesn't use Varnish. That means a web request will always hit the ruby server.
With that in mind, Rack::Cache will respect your headers and serve the cached content.
However, since the request is actually going past the http layer into the rails app, the response will always be 200 since the cache doesn't happen at the http layer anymore.
To confirm this is true, insert this in one of your cached actions:
<%= Time.now.to_i %>
Then, reload the page several times and you'll notice the timestamp won't change.

Overriding rails Cache-Control header on redirect

Whether I do:
head 302
or
head 307
or
redirect_to
calls in the same controller action to
response.headers['Cache-Control'] = "public, max-age=86400"
have no effect. Rails sends:
Cache-Control: no-cache
no matter what. I need to send the Cache-Control header to instruct an edge cache to serve the redirect for a day. Is this possible?
You can't set Cache-Control directly into the headers (anymore?), as you need to modify the response.cache_control object (since it will be used to set the Cache-Control header later).
Luckily, the expires_in method takes care of this for you:
expires_in 1.day, :public => true
See more here:
http://apidock.com/rails/ActionController/ConditionalGet/expires_in
Try using this instead:
response.headers['Cache-Control'] = 'public, max-age=300'
and make sure you're in production mode. Rails won't cache in development.
With Rails 5 you can do
response.cache_control = 'public, max-age=86400'
"I need to send the Cache-Control header to instruct an edge cache to serve the redirect for a day."
How would this work? With a temporary redirect, browsers always request the original URL first and only then follow the redirect to the other URL, which can be served from a proxy if it is cached there.
But the browser will still make first contact with your server.

Secure paperclip urls only for secure pages

I'm trying to find the best way to make paperclip urls secure, but only for secure pages.
For instance, the homepage, which shows images stored in S3, is http://mydomain.com and the image url is http://s3.amazonaws.com/mydomainphotos/89/thisimage.JPG?1284314856.
I have secure pages like https://mydomain.com/users/my_stuff/49 that has images stored in S3, but the S3 protocol is http and not https, so the user gets a warning from the browser saying that some elements on the page are not secure, blah blah blah.
I know that I can specify :s3_protocol in the model, but this makes everything secure even when it isn't necessary. So, I'm looking for the best way to change the protocol to https on the fly, only for secure pages.
One (probably bad) way would be to create a new url method like:
def custom_url(style = default_style, ssl = false)
  # Replace only the leading scheme; gsub('http', 'https') would also turn "https" into "httpss"
  ssl ? url(style).sub(%r{\Ahttp://}, 'https://') : url(style)
end
One thing to note is that I'm using the ssl_requirement plugin, so there might be a way to tie it in with that.
I'm sure there is some simple, standard way to do this that I'm overlooking, but I can't seem to find it.
If anyone stumbles upon this now: There is a solution in Paperclip since April 2012! Simply write:
Paperclip::Attachment.default_options[:s3_protocol] = ""
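in an initializer, or use the s3_protocol option inside your model. As far as I know, an empty protocol makes Paperclip generate protocol-relative URLs (//s3.amazonaws.com/...), so the browser reuses the scheme of the current page.
Thanks to @Thomas Watson for initiating this.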
If using Rails 2.3.x or newer, you can use Rails middleware to filter the response before sending it back to the user. This way you can detect if the current request is an HTTPS request and modify the calls to s3.amazonaws.com accordingly.
Create a new file called paperclip_s3_url_rewriter.rb and place it inside a directory that's loaded when the server starts. The lib directory will work, but many prefer to create an app/middleware directory and add this to the Rails application load path.
Add the following class to the new file:
class PaperclipS3UrlRewriter
  def initialize(app)
    @app = app
  end
  def call(env)
    status, headers, response = @app.call(env)
    if response.is_a?(ActionController::Response) && response.request.protocol == 'https://' && headers["Content-Type"].to_s.include?("text/html")
      body = response.body.gsub('http://s3.amazonaws.com', 'https://s3.amazonaws.com')
      # Content-Length counts bytes, not characters
      headers["Content-Length"] = body.bytesize.to_s
      # A Rack body must respond to each, so wrap the string in an array
      [status, headers, [body]]
    else
      [status, headers, response]
    end
  end
end
Then just register the new middleware:
Rails 2.3.x: Add the line below to environment.rb in the beginning of the Rails::Initializer.run block.
Rails 3.x: Add the line below to application.rb in the beginning of the Application class.
config.middleware.use "PaperclipS3UrlRewriter"
UPDATE:
I just edited my answer and added a check for response.is_a?(ActionController::Response) in the if statement. In some cases (maybe caching related) the response object is an empty array(?) and hence fails when request is called upon it.
UPDATE 2:
I edited the Rack/Middleware code example above to also update the Content-Length header. Otherwise the HTML body will be truncated by most browsers.
Use the following code in a controller class:
# locals/arguments/methods you must define or have available:
# attachment - the paperclip attachment object, not the ActiveRecord object
# request - the Rack/ActionController request
AWS::S3::S3Object.url_for \
  attachment.path,
  attachment.options[:bucket].to_s,
  :expires_in => 10.minutes, # only necessary for private buckets
  :use_ssl => request.ssl?
You can of course wrap this up nicely into a method.
FYI - some of the answers above do not work with Rails 3+, because ActionController::Response has been deprecated. Use the following:
class PaperclipS3UrlRewriter
  def initialize(app)
    @app = app
  end
  def call(env)
    status, headers, response = @app.call(env)
    if response.is_a?(ActionDispatch::BodyProxy) && headers && headers.has_key?("Content-Type") && headers["Content-Type"].include?("text/html")
      new_body = response.body[0].gsub('http://s3.amazonaws.com', 'https://s3.amazonaws.com')
      response.body[0] = new_body
      # Content-Length must describe the rewritten body, in bytes
      headers["Content-Length"] = new_body.bytesize.to_s
    end
    [status, headers, response]
  end
end
And make sure that you add the middleware in a good place in the stack (I added it after Rack::Runtime)
config.middleware.insert_after Rack::Runtime, "PaperclipS3UrlRewriter"
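Both rewriters above depend on Rails response classes that changed between versions. As an alternative, here is a sketch that relies only on the plain Rack interface: it checks env["rack.url_scheme"] and reads the body through each. The class name S3SchemeRewriter and its helper are made up for this example, and it buffers the whole body in memory, so treat it as a starting point rather than a drop-in replacement.

```ruby
# Hypothetical, framework-independent variant of the rewriters above.
class S3SchemeRewriter
  INSECURE = "http://s3.amazonaws.com".freeze
  SECURE   = "https://s3.amazonaws.com".freeze

  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    html_over_https = env["rack.url_scheme"] == "https" &&
                      header_value(headers, "Content-Type").include?("text/html")
    return [status, headers, body] unless html_over_https

    # Collect the body through #each, the only method Rack guarantees.
    html = +""
    body.each { |chunk| html << chunk }
    body.close if body.respond_to?(:close)

    html = html.gsub(INSECURE, SECURE)
    # Reuse the existing header key (casing differs between Rack versions).
    length_key = headers.keys.find { |k| k.casecmp?("Content-Length") } || "Content-Length"
    [status, headers.merge(length_key => html.bytesize.to_s), [html]]
  end

  private

  # Case-insensitive header lookup; returns "" when the header is absent.
  def header_value(headers, name)
    pair = headers.find { |key, _| key.casecmp?(name) }
    pair ? pair.last.to_s : ""
  end
end

page = ->(_env) { [200, { "Content-Type" => "text/html" }, ['<img src="http://s3.amazonaws.com/bucket/a.jpg">']] }
_status, _headers, body = S3SchemeRewriter.new(page).call("rack.url_scheme" => "https")
puts body.first
```

Requests that arrive over plain http, and non-HTML responses, pass through untouched.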

How do I view the HTTP response to an ActiveResource request?

I am trying to debug an ActiveResource call that is not working.
What's the best way to view the HTTP response to the request ActiveResource is making?
Monkey patch the connection to enable Net::HTTP debug mode. See https://gist.github.com/591601 - I wrote it to solve precisely this problem. Adding this gist to your rails app will give you Net::HTTP.enable_debug! and Net::HTTP.disable_debug! that you can use to print debug info.
Net::HTTP debug mode is insecure and shouldn't be used in production, but is extremely informative for debugging.
Add a new file to config/initializers/ called 'debug_connection.rb' with the following content:
class ActiveResource::Connection
  # Creates new Net::HTTP instance for communication with
  # remote service and resources.
  def http
    http = Net::HTTP.new(@site.host, @site.port)
    http.use_ssl = @site.is_a?(URI::HTTPS)
    http.verify_mode = OpenSSL::SSL::VERIFY_NONE if http.use_ssl
    http.read_timeout = @timeout if @timeout
    # Here's the addition that allows you to see the output
    http.set_debug_output $stderr
    return http
  end
end
This will print the whole network traffic to $stderr.
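If you want to see what set_debug_output produces before patching ActiveResource, here is a small standalone sketch. It talks to a throwaway TCP server on localhost instead of a real service and writes the log to a StringIO rather than $stderr; the helper name capture_http_debug and the canned response are invented for the example.

```ruby
require "net/http"
require "socket"
require "stringio"

# Starts a throwaway local server, makes one request with debug output
# enabled, and returns the response body plus the captured wire log.
def capture_http_debug
  server = TCPServer.new("127.0.0.1", 0) # port 0: let the OS pick a free port
  port = server.addr[1]

  responder = Thread.new do
    client = server.accept
    # Consume the request line and headers up to the blank line.
    while (line = client.gets) && line != "\r\n"; end
    client.write("HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n" \
                 "Content-Length: 2\r\nConnection: close\r\n\r\nok")
    client.close
  end

  log = StringIO.new
  http = Net::HTTP.new("127.0.0.1", port, nil) # nil: ignore proxy settings from the environment
  http.set_debug_output(log)
  response = http.get("/notifications/summary.xml")

  responder.join
  [response.body, log.string]
ensure
  server&.close
end

_body, wire_log = capture_http_debug
puts wire_log
```

The log contains the request and response as they went over the socket, headers included, which is the same information the patch above sends to $stderr.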
It's easy. Just look at the response that comes back. :)
Two options:
You have the source file on your computer. Edit it. Put a puts response.inspect at the appropriate place. Remember to remove it.
Ruby has open classes. Find the right method and redefine it to do exactly what you want, or use aliases and call chaining to do this. There's probably a method that returns the response -- grab it, print it, and then return it.
Here's a silly example of the latter option.
# Somewhere buried in ActiveResource:
class Network
  def get
    return get_request
  end
  def get_request
    "I'm a request!"
  end
end
# Somewhere in your source files:
class Network
  def print_request
    request = old_get_request
    puts request
    request
  end
  alias :old_get_request :get_request
  alias :get_request :print_request
end
Imagine the first class definition is in the ActiveResource source files. The second class definition is in your application somewhere.
$ irb -r openclasses.rb
>> Network.new.get
I'm a request!
=> "I'm a request!"
You can see that it prints it and then returns it. Neat, huh?
(And although my simple example doesn't use it since it isn't using Rails, check out alias_method_chain to combine your alias calls.)
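On newer Rubies the same wrapping can be done with Module#prepend, which is what Rails 5 recommends instead of alias_method_chain. A sketch using the same toy Network class; the module name RequestPrinting is made up.

```ruby
# Stand-in for a class buried in a library (same toy class as above).
class Network
  def get
    get_request
  end

  def get_request
    "I'm a request!"
  end
end

# Hypothetical module name; it wraps get_request without any aliasing.
module RequestPrinting
  def get_request
    request = super # calls the original implementation
    puts request
    request
  end
end

Network.prepend(RequestPrinting)

Network.new.get
```

Because the module sits in front of the class in the method lookup chain, super reaches the original method and nothing has to be renamed.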
I like Wireshark because you can start it listening on the web browser client end (usually your development machine) and then do a page request. Then you can find the HTTP packets, right click and "Follow Conversation" to see the HTTP with headers going back and forth.
This only works if you also control the server:
Follow the server log and fish out the URL that was called:
Completed in 0.26889 (3 reqs/sec) | Rendering: 0.00036 (0%) | DB: 0.02424 (9%) | 200 OK [http://localhost/notifications/summary.xml?person_id=25738]
and then open that in Firefox. If the server is truly RESTful (i.e. stateless) you will get the same response as ARes did.
Or, when I don't know the exact internals, my method is to throw in a "debugger" statement, start the server with "script/server --debugger", step through the code until I'm at the place I want, and then start inspecting right there in IRB. That might help (hey Luke btw).
Maybe the best way is to use a traffic sniffer.
(Which would totally work...except in my case the traffic I want to see is encrypted. D'oh!)
I'd use TCPFlow here to watch the traffic going over the wire, rather than patching my app to output it.
The Firefox plugin Live HTTP Headers (http://livehttpheaders.mozdev.org/) is great for this. Or you can use a website tool like http://www.httpviewer.net/
