How does the fingerprint digest get calculated in Rails 4.2 - ruby-on-rails

I am using Rails 4.2, and the documentation states that the fingerprint is an md5 digest calculated from the content of the compiled file.
If we take a file, let's say application-4c697a2e67b1a921abbdc1f753c465d8.js, then 4c697a2e67b1a921abbdc1f753c465d8 is the md5 digest. The problem is that we are never able to get the same value by generating an md5 from the content of that same file.
I have read somewhere that this fingerprint is based not only on the file but also on the environment and the version of sprockets.
Can someone explain or list the things (apart from the content of the file) that are used to generate this fingerprint? A reference from the rails sprockets repo (preferably sprockets 2.12.5) would be very helpful.

The digest seems to be built here: https://github.com/rails/sprockets/blob/master/lib/sprockets/digest_utils.rb
Looks like there's a lot of logic in there, but that's where to find the answer.
It appears that the actual hash is created by calling ADD_VALUE_TO_DIGEST[obj.class].call(obj, digest) in the build_digest method.
Good question; I learned something while looking this up.

This is true for Rails 4.2.x; I am not sure about other versions.
Three parts, concatenated in this order, go into the md5 for a file:
1. Sprockets::VERSION.
2. Rails.application.assets.version, which is generated here (https://github.com/rails/sprockets-rails/blob/2.x/lib/sprockets/railtie.rb#L91).
3. The compiled file content.
The actual digest calculation in sprockets 2.x (for bundled assets) is done in BundledAsset#L30.
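The three parts above can be sketched roughly as follows. This is only an illustration, assuming the digest is an MD5 object updated with each part in order; the version strings and the file content are made-up placeholders, and fingerprint is a hypothetical helper, not a Sprockets method.

```ruby
require "digest"

# Hypothetical helper: feeds the three parts into one MD5 digest, in order.
# In a real app the arguments would come from Sprockets::VERSION,
# Rails.application.assets.version, and the compiled file content.
def fingerprint(sprockets_version, environment_version, compiled_source)
  Digest::MD5.new
             .update(sprockets_version)
             .update(environment_version)
             .update(compiled_source)
             .hexdigest
end

# Placeholder values, chosen only for illustration.
source = "alert('hello');\n"

with_versions = fingerprint("2.12.5", "production-1.0", source)
content_only  = Digest::MD5.hexdigest(source)

puts "versions + content: #{with_versions}"
puts "content only:       #{content_only}"
```

The two printed values differ, which would explain why hashing the file content alone never reproduces the fingerprint in the file name.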

Related

True Paperclip Replacement (Specifically Structure of the File System)

With Rails 6, I need to replace Paperclip, but I can't find any substitute that easily replicates it.
Specifically, the file structure Paperclip used:
:model/:attachment_field/000/000/000/:identifier/:style/:original_file_name
Over the last decade we have built several tools that rely on that structure (or something similar). In addition, our users expect that after uploading an image they can reference the styles with the same file name and a permanent URL (not a randomly generated name like ActiveStorage and Shrine produce), and that they can change the "style" component of the URL to a different one in their HTML.
I've spent several days on both Shrine and ActiveStorage trying to get the file structure and naming to work, and I keep failing: despite being "natural replacements", they don't actually handle things in the same way.
Our end system is on Amazon S3, though integrating with that hasn't been the issue, just the file system.
Thanks for your help. It's been really frustrating to have to remove something that works great when there seems to be nothing that actually replaces it if you want or need things done in the same way. I'd rather not have to start rewriting all of the tools that we developed and resetting our customers' expectations to work with a new structure.
Thanks so much.
Have you tried Carrierwave? You can specify any storage path and build it dynamically using model name (model.class.to_s.underscore), attachment field (mounted_as), model id (model.id). The original file name is also available as original_filename.
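As a rough sketch of building such a path dynamically, the plain-Ruby helpers below reproduce the zero-padded 000/000/000 id segment and join the pieces in the order given in the question. The names id_partition and attachment_path, and all of the sample values, are hypothetical; in CarrierWave the result could be returned from an uploader's store_dir, using model and mounted_as for the inputs.

```ruby
# Hypothetical helper: splits a numeric id into three zero-padded groups,
# e.g. 1234 becomes "000/001/234".
def id_partition(id)
  format("%09d", id).scan(/\d{3}/).join("/")
end

# Hypothetical helper: assembles the path in the order
# model / field / id partition / identifier / style / file name.
def attachment_path(model_name:, field:, id:, identifier:, style:, filename:)
  File.join(model_name, field, id_partition(id), identifier, style, filename)
end

puts attachment_path(
  model_name: "users",
  field: "avatars",
  id: 42,
  identifier: "abc",
  style: "thumb",
  filename: "me.png"
)
```

Because the style is just one path segment, swapping "thumb" for another style name yields the sibling file, which is the behaviour the question asks to keep.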

Storing data inside a ruby gem, where / how to write files?

I've been working on a Ruby parser that fetches data from different API sources and compiles this data into a clear, ready-to-use JSON file.
For my use case, I need to store the data I'm initially fetching from the different sources, as I don't want to fetch it each time I use the code.
For now I'm writing the JSON I receive from the API sources locally into different JSON files stored in a data folder next to my Ruby script. Then I read those files again, parse them, and generate my new formatted JSON file that I'm going to use later in a Rails app.
For that purpose I want to create a gem from this Ruby script, which I'm currently working on. However, I'm not sure I fully understand how and where I should store that data (both the data I'm fetching and the data I'm generating).
For now I have tried to keep the code as is and simply write the file like so:
URI.open("path/to/where/i/wanna/store/file.json", "wb") do |file|
  file << URI.open(fetched_data_url).read
end
But wherever I try to write the data I get:
No such file or directory @ rb_sysopen - path/to/where/i/wanna/store/file.json
In a way this does not surprise me much, as I expected things to work differently in the context of a gem. But I'm still missing something about how to handle this. I'm not sure I fully understand how it all works, especially when you use paths in a gem that will ultimately be used in a Rails project.
So several questions here:
Whenever you use a path to write a file inside a gem, is that path relative to the gem or to the project that will ultimately use that gem? (And consequently, will the file be written inside the project that uses the gem?)
In this precise use case, what should I do about it? Where and how do I store my data so that I can use it later, knowing that I need to store it as a JSON file and that for now any attempt to write a file ends with an error?
Any input on what I'm misunderstanding here would be much appreciated! Thanks!
Whenever you use a path to write a file inside a Gem, is that path relative to the gem or to the project that will ultimately use that Gem?
There is nothing special about using file paths whether the code is part of a Gem or not.
path/to/where/i/wanna/store/file.json is a relative path, which means it is looked up relative to the current working directory of the user who started the script. That's nothing special about Gems, that's not even anything to do with Ruby. That is just how file paths work. Relative paths are relative to the current working directory, absolute paths are not.
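A small sketch of the difference, using a made-up relative path: File.expand_path resolves against the current working directory by default, and against an explicit base directory when one is passed (here the directory of the source file, via __dir__).

```ruby
# Made-up relative path, for illustration only.
relative = "data/file.json"

# Resolved against the current working directory of the running process.
from_cwd = File.expand_path(relative)

# Resolved against an explicit base: the directory containing this source file.
from_source = File.expand_path(relative, __dir__)

puts "working directory: #{Dir.pwd}"
puts "relative to cwd:    #{from_cwd}"
puts "relative to source: #{from_source}"
```

Run from different directories, the first result changes while the second does not.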
Where and how do I store my data so that I can use it later?
This depends largely on the Operating System Environment. Different OS Environments have different conventions where to store what kind of files. E.g. your files look like they fit the definition of a cache and Windows has a dedicated folder for caches, as does macOS, as do Linux distributions that follow the Linux Standard Base, as do Desktop Environments that follow the Free Desktop Standards, as does Android, as does iOS, …
For example, the Free Desktop Group has the XDG Base Directory Specification, which defines directories for application state, application data, application cache, and many other things for XDG-compliant environments. Microsoft has similar specifications for Windows. The LSB has something to say as well.
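As one possible sketch for XDG-compliant environments only: the XDG Base Directory Specification says the cache base is $XDG_CACHE_HOME, falling back to $HOME/.cache when that variable is unset or empty. The helper names cache_dir and write_cache and the application name used below are hypothetical, and other operating systems would need their own conventions. Creating the directory before writing also avoids the "No such file or directory" error when the target folder does not exist yet.

```ruby
require "fileutils"
require "json"

# Hypothetical helper: picks a per-application cache directory following the
# XDG convention ($XDG_CACHE_HOME, else ~/.cache).
def cache_dir(app_name, env = ENV)
  base = env["XDG_CACHE_HOME"]
  base = File.join(Dir.home, ".cache") if base.nil? || base.empty?
  File.join(base, app_name)
end

# Hypothetical helper: creates the directory if needed, then writes the data
# as JSON and returns the absolute path of the written file.
def write_cache(app_name, file_name, data, env = ENV)
  dir = cache_dir(app_name, env)
  FileUtils.mkdir_p(dir)
  path = File.join(dir, file_name)
  File.write(path, JSON.generate(data))
  path
end
```

A call such as write_cache("my_parser", "source_a.json", fetched_hash) would then store the fetched data outside both the gem and the Rails project, under the user's cache directory.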

Detecting stale asset with sprockets?

In sprockets how do I detect if an asset is stale?
I've tried the following and my results were unexpected:
e = Rails.application.assets # sprockets env
x = Rails.application.assets.index
e['path/to/my/asset'].body
#=> prints asset
e['path/to/my/asset'].fresh?(x)
#=> true
# modify the asset file (to change mtime and digest)
e['path/to/my/asset'].fresh?(x)
#=> true
#!? Why wasn't that false?
The caching mechanism confuses me. Further, when inspecting the asset, it tells me that the mtime is the original value, not the time I modified the file above. Can someone explain what's going on here and how I can detect a stale asset? My hope is to leverage the sprockets dependency/caching system in my gem.
My Goal:
I'm creating a gem that finds assets in the pipeline and generates some content from them. This gem integrates with ActionView, which complicates things by doing its own caching. I need some way to bust ActionView's cache if the asset in sprockets is stale and will be reloaded on next fetch. Rather than mirror Sprockets' caching system in my gem, I was hoping to just ask sprockets about the state of its assets, which seems totally possible if only I could figure out what was going on.
I can't answer how to get around this, but I can (with some confidence) tell you why it's behaving as it is. Sprockets doesn't use only the mtime to determine staleness; it also uses a digest of the file itself. It will report the asset as fresh if the mtime hasn't changed since it was recorded and, failing that, if the hash digest is unchanged (the relevant method is dependency_fresh?). Since a touch won't change the hash of a file, Sprockets will still consider it fresh.
I don't know quite what your goal is here, so I can't give much advice. The dependency tracking that is used is mostly private, but it would be possible to hack around this to force a reset. If what you're looking for is a quick way to force an asset to be stale for the sake of local testing while developing a gem, you may consider creating a monkey patch for Asset or ProcessedAsset that can flush the old mtime and digest values.
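The two-step check described above can be modelled in plain Ruby roughly like this. It is a simplified illustration, not the Sprockets source: file_fresh? is a hypothetical helper, and MD5 is assumed as the digest.

```ruby
require "digest"

# Hypothetical helper modelling the check: a file counts as fresh if its mtime
# is not newer than the recorded one, or, failing that, if its content digest
# still matches the recorded digest.
def file_fresh?(path, recorded_mtime, recorded_hexdigest)
  return false unless File.exist?(path)
  return true if recorded_mtime >= File.mtime(path)

  Digest::MD5.file(path).hexdigest == recorded_hexdigest
end
```

Under this model, touching a file moves its mtime forward but leaves the digest alone, so the file is still reported as fresh; only a change in content makes it stale.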
EDIT - I've done a bit more digging, and I think I found some useful things. The index method on assets creates a new object on each call, and is in effect a snapshot of the assets at the time, while the environment will constantly refresh when you ask it for an asset - looking up an asset causes it to automatically refresh that asset if it is stale.
In theory, this should have a surprisingly easy solution - just invert your fresh calls to x['path/to/asset'].fresh?(e).

Sync-ing files in Rails repo to S3

I am thinking of implementing a rake task that would sync certain files in my repository to S3. The catch is that I only want to update the files when they are changed in my repo. So if file A gets modified and B stays the same, only file A will be synchronized to S3 during my next app deploy.
What is a reliable way to determine that a file has been modified? I am thinking of using git to determine whether the file has been changed locally. Is there any other way to do this? Does S3 provide similar functionality?
S3 does not presently support conditional PUTs, which would be the ideal solution, but you can get this behavior with two requests instead. Your sync operation would look something like:
For each file that you want on S3:
1. Calculate the MD5 of the local file.
2. Issue a HEAD request for that S3 object.
3. Issue a PUT request if the object's Content-MD5 differs or the object does not exist.
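The decision logic of these steps can be sketched without any S3 client. In the sketch below, files_to_upload is a hypothetical helper and remote_md5s is a hypothetical stand-in for the results of the HEAD requests: a hash from S3 key to the MD5 hex digest reported for that object, with no entry when the object does not exist. The actual HEAD and PUT calls are left out.

```ruby
require "digest"

# Hypothetical helper: returns the local paths whose MD5 differs from the
# remote digest, or that have no remote counterpart at all.
def files_to_upload(local_paths, remote_md5s)
  local_paths.select do |path|
    local_md5 = Digest::MD5.file(path).hexdigest
    remote_md5s[path] != local_md5
  end
end
```

Only the paths returned by this helper would then need a PUT request during the deploy.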
That said, this sounds a lot like something you'd do with assets, in which case you'd be reinventing the wheel. The Rails 3 asset pipeline addresses this problem well -- in particular, fingerprinting assets and putting the hash in the URL allows you to serve them with insanely long max-age values since they're immutable -- and the asset_sync gem can already put your assets on S3 automatically.
What about deleted files? The easy way to handle them is to overwrite the whole directory with the latest version.

Rails' Paperclip gem POSTing instead of PUTting when uploading .zip file

I've got a form (Rails 3.2.8, Paperclip 3.1.4) with two Paperclip attachments for a model with two has_attached_files. One is meant to be an image, the other a generic file, usually a .zip file.
Everything works fine so long as I don't try to upload a .zip file. Uploading a .zip file of any size (original was 80 MB but tried 3 MB to see if it was a size issue) causes the form to POST instead of PUT and Rails throws a routing error.
The form method is POST but has the Rails' hidden _method value set to 'put', which works fine and does cause a PUT when I'm not trying to upload .zip files.
The form does have the enctype 'multipart' bit set correctly.
Any idea what could be causing this?
The file sounds large. Double-check that the actual params are making it into the request. I get this locally as well, depending on the size of the files.
The effect I've seen is that Rails would basically get no params. Since a PUT is actually a POST with a hidden element, Rails would see only the POST when the params are dropped.
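A simplified model of that behaviour, in plain Ruby, shows why dropped params turn the PUT back into a POST. This is only a sketch of the idea behind Rack::MethodOverride, not its actual code; effective_method and ALLOWED_OVERRIDES are hypothetical names.

```ruby
# Simplified model of a method override: only POST requests are upgraded,
# and only when a "_method" parameter was actually parsed from the body.
ALLOWED_OVERRIDES = %w[PUT PATCH DELETE].freeze

def effective_method(request_method, params)
  return request_method unless request_method == "POST"

  override = params["_method"].to_s.upcase
  ALLOWED_OVERRIDES.include?(override) ? override : request_method
end

puts effective_method("POST", { "_method" => "put" }) # params arrived
puts effective_method("POST", {})                     # params were dropped
```

With the hidden field present the request is treated as PUT; with an empty params hash it stays a POST, which would then hit a routing error if only the PUT route exists.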
I am actually not sure what is causing this. I think it may be the local web server, so you may need to configure nginx or whatever you are running. This never happens to me on Heroku, but it always happens locally if the file is big enough.
Also note that WEBrick has a very small limit on the size of the request payload, so don't use it. Use "thin", as it is a really easy replacement.
