Detecting stale asset with sprockets? - ruby-on-rails

In sprockets how do I detect if an asset is stale?
I've tried the following and my results were unexpected:
e = Rails.application.assets # sprockets env
x = Rails.application.assets.index
e['path/to/my/asset'].body
#=> prints asset
e['path/to/my/asset'].fresh?(x)
#=> true
# modify the asset file (to change mtime and digest)
e['path/to/my/asset'].fresh?(x)
#=> true
#!? Why wasn't that false?
The caching mechanism confuses me. Further, when I inspect the asset, it reports the original mtime, not the time I modified the file above. Can someone explain what's going on here and how I can detect a stale asset? My hope is to leverage the Sprockets dependency/caching system in my gem.
My Goal:
I'm creating a gem that finds assets in the pipeline and generates some content from them. This gem integrates with ActionView, which complicates things by doing its own caching. I need some way to bust ActionView's cache if the asset in Sprockets is stale and will be reloaded on the next fetch. Rather than mirror Sprockets' caching system in my gem, I was hoping to just ask Sprockets about the state of its assets, which seems totally possible, if only I could figure out what was going on.

I can't answer how to get around this, but I can (with some confidence) tell you why it's behaving as it is. Sprockets doesn't use only the mtime to determine staleness; it also uses a digest of the file itself. It treats the asset as fresh first if the mtime hasn't been updated, and second if the hash digest is unchanged (for the relevant method, look here for dependency_fresh?). Since a touch won't change the hash of a file, Sprockets will consider it fresh.
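To make the two-step check concrete, here is a minimal stdlib-only sketch of the idea described above. It is a simplified model, not Sprockets' actual code: the method body, the demo_freshness helper, and the use of MD5 are assumptions made for illustration.

```ruby
require "digest"
require "tempfile"

# Simplified model of a two-step staleness check (not Sprockets' real code).
# recorded_mtime and recorded_digest are the values captured at build time.
def dependency_fresh?(path, recorded_mtime, recorded_digest)
  return false unless File.exist?(path)
  # Step 1: an mtime that has not moved forward means nothing changed.
  return true if File.mtime(path) <= recorded_mtime
  # Step 2: the mtime moved, so fall back to comparing content digests.
  Digest::MD5.file(path).hexdigest == recorded_digest
end

# Hypothetical demo: returns [fresh_after_touch, fresh_after_edit].
def demo_freshness
  Tempfile.create(["asset", ".js"]) do |f|
    f.write("var a = 1;")
    f.flush
    mtime  = File.mtime(f.path)
    digest = Digest::MD5.file(f.path).hexdigest

    # "touch": bump the mtime without changing the content.
    later = mtime + 60
    File.utime(later, later, f.path)
    after_touch = dependency_fresh?(f.path, mtime, digest)

    # Real edit: the content, and therefore the digest, changes.
    File.write(f.path, "var a = 2;")
    File.utime(later, later, f.path)
    after_edit = dependency_fresh?(f.path, mtime, digest)

    [after_touch, after_edit]
  end
end

p demo_freshness
```

In this model a touched file still counts as fresh, while a file whose content changed does not.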
I don't know quite what your goal is here, so I can't give much advice. The dependency tracking that is used is mostly private, but it would be possible to hack around this to force a reset. If what you're looking for is a quick way to force an asset to be stale for the sake of local testing while developing a gem, you might consider creating a monkey patch for Asset or ProcessedAsset that can flush the old mtime and digest values.
EDIT - I've done a bit more digging, and I think I found some useful things. The index method on assets creates a new object on each call, and is in effect a snapshot of the assets at the time, while the environment will constantly refresh when you ask it for an asset - looking up an asset causes it to automatically refresh that asset if it is stale.
In theory, this should have a surprisingly easy solution - just invert your fresh calls to x['path/to/asset'].fresh?(e).

Related

How does the fingerprint digest get calculated in Rails 4.2

I am using Rails 4.2 and the document states that the fingerprint is an md5 digest calculated based on the content of the compiled file.
If we take a file, let's say application-4c697a2e67b1a921abbdc1f753c465d8.js, then 4c697a2e67b1a921abbdc1f753c465d8 is the md5 digest. The problem is that we are never able to get the same value by generating an md5 from the content of the same file.
I have read somewhere that this fingerprint is not only based on the file but also affected by the environment along with the version of sprockets.
Can someone explain or list down the things (apart from the content of the file) that are used to generate this fingerprint ? And if someone can add a reference from rails sprockets repo (preferably sprockets 2.12.5) that would be very helpful.
The digest seems to be built here: https://github.com/rails/sprockets/blob/master/lib/sprockets/digest_utils.rb
Looks like there's a lot of logic in there, but that's where to find the answer.
It appears that the actual hash is created by calling ADD_VALUE_TO_DIGEST[obj.class].call(obj, digest) in the build_digest method.
Good question; I learned something while looking this up.
This is true for Rails 4.2.x; I'm not sure about other versions.
There are three parts (concatenated in the same order) involved in generating an md5 against a file.
Sprockets::VERSION.
Rails.application.assets.version that is generated here (https://github.com/rails/sprockets-rails/blob/2.x/lib/sprockets/railtie.rb#L91).
Compiled file content.
The actual digest calculation in sprockets 2.x (for bundled assets) is being done here BundledAsset#L30
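The three-part concatenation above can be sketched with the standard library alone. The two version constants below are hypothetical stand-ins for Sprockets::VERSION and Rails.application.assets.version, and the fingerprint method is an illustration of the idea rather than the Sprockets source.

```ruby
require "digest"

# Hypothetical stand-ins; in a real app these values would come from
# Sprockets::VERSION and Rails.application.assets.version.
SPROCKETS_VERSION   = "2.12.5"
ENVIRONMENT_VERSION = "production-1.0"

# Feed the three parts into one MD5 digest, in the order listed above.
def fingerprint(content,
                sprockets_version = SPROCKETS_VERSION,
                env_version = ENVIRONMENT_VERSION)
  Digest::MD5.new
    .update(sprockets_version)
    .update(env_version)
    .update(content)
    .hexdigest
end

content = "alert('hi');"
puts fingerprint(content)            # digest seeded with both versions
puts Digest::MD5.hexdigest(content)  # content-only digest, which differs
```

This would explain why hashing the compiled file by itself never reproduces the fingerprint in the filename.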

Sprockets cache not invalidated by file change

I'm running a test Ruby-on-Rails app using Webrick in a test environment. The automated end-to-end test accesses an admin page which causes a JavaScript file to be updated, and that file is used by another admin page. The problem is that the second admin page does not see the update, but instead gets the old copy of the JavaScript file. I can see the changed file on the file system, but even if I use curl from the command line, I still get the old version of the file. The test used to work (at least with Rails 4.0, if not 4.1). It is only now that I am trying to update to Rails 4.2 that this problem has arisen.
Is there something I can do to tell Rails/Sprockets to forget its old cached copy of the JavaScript that was updated? I know when I am updating it, and wouldn't mind even resetting Sprockets' entire cache if I couldn't do it selectively. What I can't do is restart the server each time a JavaScript file gets updated.
I tried many approaches to making Sprockets "forget" its cached copy, but it seemed very determined to remember, and I began to have the sense that I was fighting against a firmly entrenched design decision. In the end I decided that, just for my generated-on-the-fly JavaScript files, I would avoid sprockets altogether, which meant handling the compiling and fingerprinting of the files myself, and writing my own version of javascript_include_tag just for those generated files.
For reference, the compiling and fingerprinting is actually fairly easy:
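require "uglifier"
require "digest"
# Minify the generated file, then fingerprint the minified output.
minified_content = Uglifier.compile(File.read(my_generated_js_file))
fingerprint = Digest::MD5.hexdigest(minified_content)
# Build the output name from the source file name, e.g. "widgets-<md5>.js".
fingerprinted_file = File.basename(my_generated_js_file, '.js') + '-' + fingerprint + '.js'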
(and then just write out the fingerprinted file to public/assets).
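The write-out step might look roughly like the sketch below. Minification is deliberately left out so the example needs no gems, and write_fingerprinted is a hypothetical helper name, not something from Sprockets or Rails.

```ruby
require "digest"
require "fileutils"
require "tmpdir"

# Hypothetical helper: copy source_path into out_dir under a fingerprinted
# name and return the new path. Minification is skipped in this sketch.
def write_fingerprinted(source_path, out_dir)
  content = File.read(source_path)
  fingerprint = Digest::MD5.hexdigest(content)
  name = "#{File.basename(source_path, '.js')}-#{fingerprint}.js"
  FileUtils.mkdir_p(out_dir)
  target = File.join(out_dir, name)
  File.write(target, content)
  target
end

# Usage, with a temporary directory standing in for the app root.
Dir.mktmpdir do |dir|
  src = File.join(dir, "generated.js")
  File.write(src, "var x = 1;")
  puts write_fingerprinted(src, File.join(dir, "public", "assets"))
end
```

Because the name changes whenever the content changes, a custom include-tag helper only has to emit the path returned here.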

config.assets.compile=true in Rails production, why not?

The default Rails app installed by rails new has config.assets.compile = false in production.
And the ordinary way to do things is to run rake assets:precompile before deploying your app, to make sure all asset pipeline assets are compiled.
So what happens if I set config.assets.compile = true in production?
I won't need to run precompile anymore. What I believe will happen is that the first time an asset is requested, it will be compiled. This will be a performance hit that first time (and it means you generally need a JS runtime in production to do it). But other than these downsides, after the asset has been lazily compiled, I think all subsequent access to that asset will have no performance hit; the app's performance will be exactly the same as with precompiled assets after this initial first-hit lazy compilation. Is this true?
Is there anything I'm missing? Any other reasons not to set config.assets.compile = true in production? If I've got a JS runtime in production, and am willing to take the tradeoff of degraded performance for the first access of an asset, in return for not having to run precompile, does this make sense?
I wrote that bit of the guide.
You definitely do not want to live compile in production.
When you have compile on, this is what happens:
Every request for a file in /assets is passed to Sprockets. On the first request for each and every asset it is compiled and cached in whatever Rails is using for cache (usually the filesystem).
On subsequent requests Sprockets receives the request and has to look up the fingerprinted filename, check that the file (image) or files (CSS and JS) that make up the asset were not modified, and then, if there is a cached version, serve that.
That is everything in the assets folder and in any vendor/assets folders used by plugins.
That is a lot of overhead as, to be honest, the code is not optimized for speed.
This will have an impact on how fast assets go over the wire to the client, and will negatively impact the page load times of your site.
Compare with the default:
When assets are precompiled and compile is off, assets are compiled, fingerprinted, and written to public/assets. Sprockets returns a mapping table of the plain to fingerprinted filenames to Rails, and Rails writes this to the filesystem. The manifest file (YML in Rails 3 or JSON with a randomised name in Rails 4) is loaded into memory by Rails at startup and cached for use by the asset helper methods.
This makes the generation of pages with the correct fingerprinted assets very fast, and the serving of the files themselves is web-server-from-the-filesystem fast. Both are dramatically faster than live compiling.
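As a rough illustration of why the manifest makes helper lookups cheap, the sketch below parses a small JSON mapping once and then answers each lookup from an in-memory hash. The manifest contents, the ASSET_TABLE constant, and the asset_path method are simplified assumptions, not the Rails implementation.

```ruby
require "json"

# Simplified, made-up manifest: logical name => fingerprinted name.
MANIFEST_JSON = <<~JSON
  { "assets": { "application.js": "application-4c697a2e67b1a921abbdc1f753c465d8.js" } }
JSON

# Parsed once (at boot in a real app) and reused for every lookup.
ASSET_TABLE = JSON.parse(MANIFEST_JSON).fetch("assets").freeze

# A helper then only needs a hash lookup; no file is compiled or stat-ed.
def asset_path(logical_name)
  "/assets/#{ASSET_TABLE.fetch(logical_name)}"
end

puts asset_path("application.js")
```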
To get the maximum advantage of the pipeline and fingerprinting, you need to set far-future headers on your web server, and enable gzip compression for js and css files. Sprockets writes gzipped versions of assets which you can set your server to use, removing the need for it to do so for each request.
This gets assets out to the client as fast as possible, and in the smallest size possible, speeding up client-side display of the pages, and reducing (with far-future headers) requests.
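For a sense of what a pre-gzipped sibling file is, here is a small stdlib-only sketch that writes one next to an asset. write_gzip_sibling is a hypothetical helper, and this is not how Sprockets itself is implemented; the point is only that compression happens once, ahead of time, instead of on every request.

```ruby
require "zlib"
require "tmpdir"

# Hypothetical helper: write "<path>.gz" next to the asset and return its path.
def write_gzip_sibling(path)
  gz_path = "#{path}.gz"
  Zlib::GzipWriter.open(gz_path, Zlib::BEST_COMPRESSION) do |gz|
    gz.write(File.binread(path))
  end
  gz_path
end

# Usage with a throwaway stylesheet.
Dir.mktmpdir do |dir|
  css = File.join(dir, "application.css")
  File.write(css, "body{margin:0}\n" * 200)
  gz = write_gzip_sibling(css)
  puts "#{File.size(css)} bytes -> #{File.size(gz)} bytes gzipped"
end
```

A web server configured to serve precompressed files can then send the .gz version as-is.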
So if you are live compiling it is:
Very slow
Lacks compression
Will impact render time of pages
Versus
As fast as possible
Compressed
Removes compression overhead from the server (optionally).
Minimize render time of pages.
Edit: (Answer to follow up comment)
The pipeline could be changed to precompile on the first request, but there are some major roadblocks to doing so. The first is that there has to be a lookup table for fingerprinted names or the helper methods are too slow. Under a compile-on-demand scenario there would need to be some way to append to the lookup table as each new asset is compiled or requested.
Also, someone would have to pay the price of slow asset delivery for an unknown period of time until all the assets are compiled and in place.
The default, where the price of compiling everything is paid off-line at one time, does not impact public visitors and ensures that everything works before things go live.
The deal-breaker is that it adds a lot of complexity to production systems.
[Edit, June 2015] If you are reading this because you are looking for a solution for slow compile times during a deploy, then you could consider precompiling the assets locally. Information on this is in the asset pipeline guide. This allows you to precompile locally only when there is a change, commit that, and then have a fast deploy with no precompile stage.
To reduce the overhead of precompiling:
Precompile everything initially with these settings in production.rb
# Precompile *all* assets, except those that start with underscore
config.assets.precompile << /(^[^_\/]|\/[^_])[^\/]*$/
You can then simply reference images and stylesheets as "/assets/stylesheet.css" in *.html.erb
or "/assets/web.png"
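To see which paths that regular expression selects, a quick check like the following may help. The sample paths and the precompiled? helper are made up for illustration; only the pattern itself comes from the setting above.

```ruby
# The pattern from the config line above.
PRECOMPILE_ALL = /(^[^_\/]|\/[^_])[^\/]*$/

# Hypothetical helper: true when the file name part does not start with "_".
def precompiled?(logical_path)
  !!(logical_path =~ PRECOMPILE_ALL)
end

%w[application.css admin/site.js _mixins.scss admin/_form.css].each do |path|
  puts "#{path}: #{precompiled?(path)}"
end
```

Files whose own name starts with an underscore are skipped, whether they sit at the top level or in a subdirectory.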
For anyone using Heroku:
If you deploy to Heroku, it will run the precompile for you automatically during the deploy if compiled assets are not included (i.e. public/assets is not committed), so there is no need for config.assets.compile = true or for committing the precompiled assets.
Heroku's docs are here. A CDN is recommended to remove the load on the dyno resource.
It won't be the same as precompiling, even after that first hit: because the files aren't written to the filesystem they can't be served directly by the web server. Some ruby code will always be involved, even if it just reads a cache entry.
Set config.assets.compile = false
Add to your Gemfile
group :assets do
gem 'turbo-sprockets-rails3'
end
Install the bundle
Run rake assets:precompile
Then Start your server
From the official guide:
On the first request the assets are compiled and cached as outlined in development above, and the manifest names used in the helpers are altered to include the MD5 hash.
Sprockets also sets the Cache-Control HTTP header to max-age=31536000. This signals all caches between your server and the client browser that this content (the file served) can be cached for 1 year. The effect of this is to reduce the number of requests for this asset from your server; the asset has a good chance of being in the local browser cache or some intermediate cache.
This mode uses more memory, performs poorer than the default and is not recommended.
Also, the precompile step is no trouble at all if you use Capistrano for your deploys. It takes care of it for you. You just run
cap deploy
or (depending on your setup)
cap production deploy
and you're all set. If you still don't use it, I highly recommend checking it out.
Because it is opening a directory traversal vulnerability - https://blog.heroku.com/rails-asset-pipeline-vulnerability

When are compiled assets being cached in rails

When I precompile my assets for a rails 3.1 app with rake assets:precompile it spits out an old cached version if nothing changes in the asset files. I can tell because my erb is making use of a constant that I was trying to change elsewhere in my app. One work around is to alter one of the css files (eg by adding a space etc) before re-precompiling but this is a pain and I would like to try and disable this caching if it is possible. Any ideas???
This is the expected behavior of the pipeline - the ERB is evaluated only once when you precompile. The value at compile time is the value you get in the file.
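The compile-time evaluation can be reproduced with plain ERB. In this sketch the Theme module, TEMPLATE, and compile are hypothetical names standing in for an app setting, a .css.erb file, and the precompile step.

```ruby
require "erb"

# Hypothetical app setting that the stylesheet template reads.
module Theme
  class << self
    attr_accessor :color
  end
  self.color = "red"
end

TEMPLATE = "body { color: <%= Theme.color %>; }"

# Stand-in for the precompile step: ERB is evaluated here, exactly once.
def compile(template)
  ERB.new(template).result
end

compiled = compile(TEMPLATE)  # value is baked in at this point
Theme.color = "blue"          # changing the setting afterwards...
puts compiled                 # ...does not alter the compiled output
puts compile(TEMPLATE)        # only a recompile picks up the new value
```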
The caching is based on checking the timestamps of the files. You could run Sprockets in production without precompiling (this is called live compiling), but you cannot turn off the caching because the performance would be dreadful: every single request would require Sprockets to recompile all the files.
Sorry :-(

Static asset caching on Heroku with Jammit by changing ActionController::Base#page_cache_directory

I'm attempting to use Jammit for packaging CSS and JS for a Rails app deployed on Heroku, which doesn't work out of the box due to Heroku's read only file system. Every example I've seen of how to do this recommends building all the packaged asset files in advance. Because of Heroku's Git-based deployment, this means you need to make a separate commit to your repository every time these files change, which is not an acceptable solution to me. Instead, I want to change the path that Jammit uses to write the cached packages to #{Rails.root}/tmp/assets (by changing ActionController::Base#page_cache_directory), which is writable on Heroku.
What I don't understand is how the cached files will be used without hitting the Rails stack every time, even using the default path for cached packages. Let me explain what I mean:
When you include a package using Jammit's helper, it looks something like this:
<%= include_javascripts :application %>
which generates this script tag:
<script src="/assets/application.js" type="text/javascript"></script>
When the browser requests this URL, what actually happens is that it gets routed to Jammit::Controller#package, which renders the contents of the package to the browser and then writes a cached copy to #{page_cache_directory}/assets/application.js. The idea is that this cached file is built on the first request, and subsequent requests should serve the cached file directly without hitting the Rails stack. I looked through the Jammit code and I don't see how this is supposed to happen. What prevents subsequent requests to /assets/application.js from simply routing to Jammit::Controller again and never using the cached file?
My guess is that there's a Rack middleware somewhere I'm not seeing that serves the file if it exists and forwards the request on to the controller if it doesn't. If that's the case, where is that code? And how would it work when changing ActionController::Base#page_cache_directory (effectively changing where Jammit writes cached packages)? Since #{Rails.root}/tmp is above the public document root, there's no URL that maps to that path.
Great question! I haven't set this up myself, but it's something I've been meaning to look into, so you've prompted me to do so. Here's what I would try (I'll give it a shot myself soon, but you are probably going to beat me to it).
config.action_controller.page_cache_directory = "#{Rails.root}/tmp/page_cache"
Now change your config.ru to:
require ::File.expand_path('../config/environment', __FILE__)
run Rack::URLMap.new(
  "/" => Your::App.new,
  "/assets" => Rack::Directory.new("tmp/page_cache/assets")
)
Just make sure not to have anything in public/assets, since that won't ever be picked up.
Notes:
This is for Rails 3. Not sure of the solution under Rails 2.
It looks like Rack::Directory sets cache control headers to 12 hours so Heroku will cache your assets to Varnish. Not sure if Jammit sets this in its controller, but even if it doesn't, it will be cached quite quickly.
Heroku also sets ENV['TMPDIR'] now as well, so you can use that instead of Rails.root + '/tmp' if you wish.
This might be of use, it's for a different gem but the idea is similar and I'm trying to get it working with the plain asset helpers.
http://devcenter.heroku.com/articles/using-compass
Unfortunately it seems to be quite difficult to get rails to do this without patching/rewriting the asset helpers module (which resembles coupled spaghetti).

Resources