In Rails, how do I determine if two URLs are equal? - ruby-on-rails

If I have two URLs in Rails, (whether they be in string form or URI objects) what's the best way to determine if they are equal? It seems like a fairly simple problem, but I need the solution to work even if one of the URLs is relative and the other is absolute, or if one of the URLs has different parameters than the other.
I already looked at What is the best way in Rails to determine if two (or more) given URLs (as strings or hash options) are equal? (and several other questions), but the question was pretty old and the suggested solution doesn't work the way I need it to.

Provided url1 and url2 are strings containing URLs, you can compare the controller and action they route to:
def is_same_controller_and_action?(url1, url2)
hash_url1 = Rails.application.routes.recognize_path(url1)
hash_url2 = Rails.application.routes.recognize_path(url2)
[:controller, :action].each do |key|
return false if hash_url1[key] != hash_url2[key]
end
return true
end

1) convert URL to canonical form
In my current project I am using the addressable gem to do that:
def to_canonical(url)
uri = Addressable::URI.parse(url)
uri.scheme = "http" if uri.scheme.blank?
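# Strip only a leading "www." (the original pattern /\www\./ matched any word character followed by "ww." anywhere in the string)
host = uri.host.sub(/\Awww\./, '') if uri.host.present?
path = (uri.path.present? && uri.host.blank?) ? uri.path.sub(/\Awww\./, '') : uri.path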
uri.scheme.to_s + "://" + host.to_s + path.to_s
rescue Addressable::URI::InvalidURIError
nil
rescue URI::Error
nil
end
Example:
> to_canonical('www.example.com') => 'http://example.com'
> to_canonical('http://example.com') => 'http://example.com'
2) compare your URLs: canonical_url1 == canonical_url2
UPD:
Does it work with sub-domains? - No. I mean, we cannot say that translate.google.com and google.com are equal. Of course, you can modify it depending on your needs.
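The two steps above (canonicalize, then compare) can also be sketched with only Ruby's standard library. This is a rough sketch, not a replacement for addressable: the helper names canonical_url and same_url? and the default base URL are hypothetical, relative URLs are resolved against that assumed base, and query strings and fragments are dropped on the assumption that differing parameters should not affect equality.

```ruby
require 'uri'

# Hypothetical helper: resolve the URL against an assumed base, lowercase the
# scheme and host via URI#normalize, and drop the query string and fragment.
def canonical_url(url, base: 'http://example.com/')
  uri = URI.join(base, url).normalize
  uri.query = nil
  uri.fragment = nil
  uri.to_s
rescue URI::Error
  nil # unparseable input has no canonical form
end

# Hypothetical helper: two URLs are "equal" when their canonical forms match.
def same_url?(url1, url2, base: 'http://example.com/')
  a = canonical_url(url1, base: base)
  b = canonical_url(url2, base: base)
  !a.nil? && a == b
end

puts same_url?('/folder?x=1', 'http://EXAMPLE.com/folder') # relative vs. absolute
puts same_url?('/a', '/b')
```

In a Rails app the base would more plausibly come from the current request than from a hard-coded default.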

Check out the addressable gem, specifically the normalize method (and its documentation) and the heuristic_parse method (and its documentation). I've used it in the past and found it to be very robust.
Addressable even handles URLs with unicode characters in them:
uri = Addressable::URI.parse("http://www.詹姆斯.com/")
uri.normalize
#=> #<Addressable::URI:0xc9a4c8 URI:http://www.xn--8ws00zhy3a.com/>

Related

Decoding a redirect url - Ruby on Rails

I'd like to use the URI or CGI libraries to get the path from the query part of this url. In other words, just: '/scouting/amateur'. Is this possible or do I need to use regexp?
http://10.241.180.63:3149/login?redirect_path=http%3A%2F%2F10.241.180.63%3A3149%2Fscouting%2Famateur
Suggestion with Ruby built-ins (if designing a method, you might want to implement some error handling).
require 'uri'
query = URI("http://10.241.180.63:3149/login?redirect_path=http%3A%2F%2F10.241.180.63%3A3149%2Fscouting%2Famateur").query
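# URI.decode was removed in Ruby 3.0; URI.decode_www_form_component does the same job here
path = URI(URI.decode_www_form_component(query).split('=')[1]).path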
You may find the gem uri-query_params helpful / more elegant (it will decode query params automatically). E.g.
require 'uri/query_params'
uri = URI("http://10.241.180.63:3149/login?redirect_path=http%3A%2F%2F10.241.180.63%3A3149%2Fscouting%2Famateur")
URI(uri.query_params["redirect_path"]).path
Try this -
require 'uri'
url = URI.parse('http://10.241.180.63:3149/login?redirect_path=http%3A%2F%2F10.241.180.63%3A3149%2Fscouting%2Famateur')
redirect_path = URI.decode_www_form_component(url.query.split("redirect_path=").last)
# redirect_path is now "http://10.241.180.63:3149/scouting/amateur"
result = redirect_path.split("/").last(2).join("/")
# result = 'scouting/amateur'
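As a sketch of the error handling mentioned above, the built-in approach could be wrapped in a small method. The method name redirect_path_from and its param argument are hypothetical; it uses only URI.parse and URI.decode_www_form from the standard library and returns nil when the parameter is missing or the input cannot be parsed.

```ruby
require 'uri'

# Hypothetical helper: return the path of the URL stored in a query parameter,
# or nil if the parameter is absent or either URL is malformed.
def redirect_path_from(url, param = 'redirect_path')
  query = URI.parse(url).query
  return nil if query.nil?

  # decode_www_form percent-decodes each key/value pair of the query string
  value = URI.decode_www_form(query).to_h[param]
  value && URI.parse(value).path
rescue URI::InvalidURIError, ArgumentError
  nil
end

puts redirect_path_from('http://10.241.180.63:3149/login?redirect_path=http%3A%2F%2F10.241.180.63%3A3149%2Fscouting%2Famateur')
# prints /scouting/amateur
```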

Removing a part of a URL with Ruby

Removing the query string from a URL in Ruby could be done like this:
url.split('?')[0]
Where url is the complete URL including the query string (e.g. url = http://www.domain.extension/folder?schnoo=schnok&foo=bar).
Is there a faster way to do this, i.e. without using split, but rather using Rails?
edit: The goal is to redirect from http://www.domain.extension/folder?schnoo=schnok&foo=bar to http://www.domain.extension/folder.
EDIT: I used:
url = 'http://www.domain.extension/folder?schnoo=schnok&foo=bar'
parsed_url = URI.parse(url)
new_url = parsed_url.scheme+"://"+parsed_url.host+parsed_url.path
Easier to read and harder to screw up if you parse and set fragment & query to nil instead of rebuilding the URL.
parsed = URI::parse("http://www.domain.extension/folder?schnoo=schnok&foo=bar#frag")
parsed.fragment = parsed.query = nil
parsed.to_s
# => "http://www.domain.extension/folder"
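To illustrate why clearing the fields is harder to get wrong than rebuilding the string, here is a small comparison on a URL with an explicit port. The port 8080 is added only for illustration and is not part of the original question.

```ruby
require 'uri'

# Same example URL, with an explicit port added for illustration
url = 'http://www.domain.extension:8080/folder?schnoo=schnok&foo=bar#frag'

# Rebuilding from scheme, host and path silently drops the port
parsed = URI.parse(url)
rebuilt = parsed.scheme + '://' + parsed.host + parsed.path

# Clearing query and fragment keeps every other component intact
stripped = URI.parse(url)
stripped.query = nil
stripped.fragment = nil

puts rebuilt       # prints http://www.domain.extension/folder
puts stripped.to_s # prints http://www.domain.extension:8080/folder
```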
url = 'http://www.domain.extension/folder?schnoo=schnok&foo=bar'
u = URI.parse(url)
params = CGI.parse(u.query) # needs require 'uri' and require 'cgi' outside Rails
# params is now {"schnoo"=>["schnok"], "foo"=>["bar"]}
See also: how to get query string from passed url in ruby on rails
A regex also works, and it avoids building the array that split returns:
'http://www.domain.extension/folder?schnoo=schnok&foo=bar'[/[^\?]+/]
#=> "http://www.domain.extension/folder"
There is probably no need to split the URL. When you visit this link, you pass two parameters to the back end:
http://www.domain.extension/folder?schnoo=schnok&foo=bar
params[:schnoo] # => "schnok"
params[:foo]    # => "bar"
Watch your log and you will see them; you can then use them directly in the controller.

ActionController::Base.param_parsers Alternative

I have found several websites pointing to using the following code to add support for custom parameter formats:
ActionController::Base.param_parsers[Mime::PLIST] = lambda do |body|
str = StringIO.new(body)
plist = CFPropertyList::List.new({:data => str.string})
CFPropertyList.native_types(plist.value)
end
This one is for the Apple plist format, which is what I am looking to support. However, with Rails 3.2.1 the dev server won't start, saying that param_parsers is undefined. I cannot seem to find any documentation saying it was deprecated or naming an alternative, just that it is included in the 2.x documentation and not in the 3.x documentation.
Is there any other way in Rails 3 to support custom parameter formats in POST and PUT requests?
The params parsing moved to a Rack middleware. It is now part of ActionDispatch.
To register new parsers, you can either redeclare the use of the middleware like so:
MyRailsApp::Application.config.middleware.delete "ActionDispatch::ParamsParser"
MyRailsApp::Application.config.middleware.use(ActionDispatch::ParamsParser, {
Mime::PLIST => lambda do |body|
str = StringIO.new(body)
plist = CFPropertyList::List.new({:data => str.string})
CFPropertyList.native_types(plist.value)
end
})
or you can change the constant containing the default parsers like so
ActionDispatch::ParamsParser::DEFAULT_PARSERS[Mime::PLIST] = lambda do |body|
str = StringIO.new(body)
plist = CFPropertyList::List.new({:data => str.string})
CFPropertyList.native_types(plist.value)
end
The first variant is probably the cleanest. But you need to be aware that the last one to replace the middleware declaration wins there.

Ruby/Rails 3.1: Given a URL string, remove path

Given any valid HTTP/HTTPS string, I would like to parse/transform it such that the end result is exactly the root of the string.
So given URLs:
http://foo.example.com:8080/whatsit/foo.bar?x=y
https://example.net/
I would like the results:
http://foo.example.com:8080/
https://example.net/
I found the documentation for URI::Parser not super approachable.
My initial, naïve solution would be a simple regex like:
/\A(https?:\/\/[^\/]+\/)/
(That is: Match up to the first slash after the protocol.)
Thoughts & solutions welcome. And apologies if this is a duplicate, but my search results weren't relevant.
With URI::join:
require 'uri'
url = "http://foo.example.com:8080/whatsit/foo.bar?x=y"
baseurl = URI.join(url, "/").to_s
#=> "http://foo.example.com:8080/"
Use URI.parse and then set the path to an empty string and the query to nil:
require 'uri'
uri = URI.parse('http://foo.example.com:8080/whatsit/foo.bar?x=y')
uri.path = ''
uri.query = nil
cleaned = uri.to_s # http://foo.example.com:8080
Now you have your cleaned up version in cleaned. Taking out what you don't want is sometimes easier than only grabbing what you need.
If you only do uri.query = '' you'll end up with http://foo.example.com:8080? which probably isn't what you want.
You could use URI.split and then put the parts back together...
WARNING: It's a little sloppy.
require 'uri'
url = "http://example.com:9001/over-nine-thousand"
parts = URI.split(url) # [scheme, userinfo, host, port, registry, path, opaque, query, fragment]
puts "%s://%s:%s" % [parts[0], parts[2], parts[3]]
# prints http://example.com:9001
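For comparison, here is a short sketch that runs the URI.join approach and the regex from the question over both example URLs. The helper names root_of and root_by_regex are hypothetical. Both return the root with a trailing slash, which is the form the question asked for.

```ruby
require 'uri'

# Hypothetical helper: resolve "/" against the URL, which keeps scheme, host
# and port and discards the path, query and fragment.
def root_of(url)
  URI.join(url, '/').to_s
end

# Hypothetical helper: the regex from the question, returning the matched prefix
def root_by_regex(url)
  url[/\Ahttps?:\/\/[^\/]+\//]
end

urls = [
  'http://foo.example.com:8080/whatsit/foo.bar?x=y',
  'https://example.net/'
]

urls.each do |u|
  puts root_of(u)
  puts root_by_regex(u)
end
```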

What is involved with changing attachment_fu's storage scheme?

I have a rails application that is using attachment_fu. Currently, it is using :file_system for storage, but I want to change it to :s3, to allow for better scaling as more files get uploaded.
What is involved with this? I imagine that if I just switch the code to use :s3, all the old links will be broken. Do I need to just copy the existing files from the file system to S3? A google search hasn't turned up much on the topic.
I would prefer to move the existing files over to S3, so everything is in the same place, but if necessary, the old files can stay where they are, as long as new ones go to S3.
EDIT: So, it is not as simple as copying the files over to S3; the URLs are created using a different scheme. When they are stored in :file_system, the files end up in places like /public/photos/0000/0001/file.name, but the same file in :s3 might end up in 0/1/file.name. I think it is using the id or something similar, and just padding it (or not) with zeros, but I'm not sure of that.
That's correct. The ids are padded using :file_system storage.
Instead of renaming all your files, you can alter the s3 backend module to use padded numbers as well.
Copy the partitioned_path method from file_system_backend.rb and put it in s3_backend.rb.
def partitioned_path(*args)
if respond_to?(:attachment_options) && attachment_options[:partition] == false
args
elsif attachment_options[:uuid_primary_key]
# Primary key is a 128-bit UUID in hex format. Split it into 2 components.
path_id = attachment_path_id.to_s
component1 = path_id[0..15] || "-"
component2 = path_id[16..-1] || "-"
[component1, component2] + args
else
path_id = attachment_path_id
if path_id.is_a?(Integer)
# Primary key is an integer. Split it after padding it with 0.
("%08d" % path_id).scan(/..../) + args
else
# Primary key is a String. Hash it, then split it into 4 components.
hash = Digest::SHA512.hexdigest(path_id.to_s)
[hash[0..31], hash[32..63], hash[64..95], hash[96..127]] + args
end
end
end
Modify s3_backend.rb's full_filename method to use the partitioned_path.
def full_filename(thumbnail = nil)
File.join(base_path, *partitioned_path(thumbnail_name_for(thumbnail)))
end
attachment_fu will now create paths with the same names as it did with the file_system backend, so you can just copy your files over to s3 without renaming everything.
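The integer branch of partitioned_path can be tried in isolation to see how an id becomes a padded directory pair. This is a standalone sketch: the method name padded_partition and the public/photos prefix are hypothetical stand-ins, and only the "%08d" padding and the four-character split come from the code above.

```ruby
# Hypothetical standalone version of the integer branch of partitioned_path:
# zero-pad the id to 8 digits, then split it into two 4-character directories.
def padded_partition(id, filename)
  ("%08d" % id).scan(/..../) + [filename]
end

puts File.join('public/photos', *padded_partition(1, 'file.name'))
# prints public/photos/0000/0001/file.name
```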
In addition to nilbus' answer, I had to modify s3_backend.rb's base_path method to return an empty string, otherwise it would insert the attachment_path_id twice:
def base_path
return ''
end
What worked for me, in addition to nilbus's answer, was to modify s3_backend.rb's base_path method to still use the path_prefix (which is by default the table name):
def base_path
attachment_options[:path_prefix]
end
And also, I had to take the attachment_path_id from file_system_backend.rb and replace the one in s3_backend.rb, since otherwise partitioned_path always thought my Primary Key was a String:
def attachment_path_id
((respond_to?(:parent_id) && parent_id) || id) || 0
end
Thanks for all those responses which helped a lot. It worked for me too but I had to do this in order to have the :thumbnail_class option working :
def full_filename(thumbnail = nil)
prefix = (thumbnail ? thumbnail_class : self).attachment_options[:path_prefix].to_s
File.join(prefix, *partitioned_path(thumbnail_name_for(thumbnail)))
end
