Removing characters from a URL field - ruby-on-rails

I have a user profile with a URL field that shows the user's website. Right now it is displayed like this: http://www.userwebsite.com
How can I remove the "http://www." part in my show.html.erb file when displaying the user profile?

You can either manipulate the string yourself or use the URI module:
require 'uri'
# Parse the URL so the host can be read on its own
url = URI.parse("http://www.userwebsite.com")
# Keep only the last two labels of the host: "userwebsite.com"
# (note that this also drops any other subdomain, not just "www")
url.host.split(".")[-2..-1].join(".")
The advantage of doing it this way is that you are working with the host only, not the scheme or other noise such as the path after the host.
Splitting the URL string by hand may look easier, but you will need more error handling and special-case handling that way.
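For the view itself, a small helper along these lines might be enough. This is only a sketch: display_host is a made-up name, and it assumes you want to drop just the scheme and a leading "www." while keeping other subdomains. If the value cannot be parsed as a URL with a host, it is returned unchanged.

```ruby
require 'uri'

# Hypothetical helper (the name is an assumption, not a Rails API).
# Returns the host without a leading "www.", or the original value
# when no host can be extracted.
def display_host(url)
  host = URI.parse(url.to_s.strip).host
  return url if host.nil?
  host.sub(/\Awww\./i, '')
rescue URI::InvalidURIError
  url
end

display_host("http://www.userwebsite.com")   # => "userwebsite.com"
display_host("https://blog.example.co.uk/a") # => "blog.example.co.uk"
```

In show.html.erb this could then be called on whatever attribute holds the URL, for example a hypothetical @user.url.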

Related

Twitter share API doesn't get URL with two # signs

I want to share text and a URL like this:
test http://one#two#three
So I tried
https://twitter.com/intent/tweet?text=test&url=http%3A%2F%2Fone%23two%23three
and the result contains only "test", without the URL.
When the URL contains only one # sign it works fine,
for example
https://twitter.com/intent/tweet?text=test&url=http%3A%2F%2Fone%23two
gives me
test http://one#two
How can I add a second #?
This is almost certainly because it is not correct syntax to have more than one # in a URL.
What sort of URL are you sharing that needs two?
If you absolutely have to, you can cheat and double-encode the second #: URL-encode the % of its %23 into %25, which gives %2523:
https://twitter.com/intent/tweet?text=test&url=http%3A%2F%2Fone%23two%2523two
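If the share link is built on the Rails side, the same trick can be sketched in Ruby. The helper name tweet_intent_url is an assumption for illustration; it pre-encodes every # after the first one and then encodes the whole URL as a query parameter, so the later # signs end up as %2523. Whether Twitter then displays the link the way you want is not something this code can guarantee.

```ruby
require 'uri'

# Hypothetical helper: builds a tweet intent link in which every "#"
# after the first one is double-encoded.
def tweet_intent_url(text, url)
  head, hash, rest = url.partition("#")
  # Turn later "#" signs into the literal text "%23" first ...
  safe_url = head + hash + rest.gsub("#", "%23")
  # ... so that form encoding turns them into "%2523".
  query = URI.encode_www_form(text: text, url: safe_url)
  "https://twitter.com/intent/tweet?#{query}"
end

tweet_intent_url("test", "http://one#two#three")
# => "https://twitter.com/intent/tweet?text=test&url=http%3A%2F%2Fone%23two%2523three"
```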

URL shared through facebook is altered when it includes empty square brackets

I have a facebook like button, with layout button_count which works perfectly in most situations but when the URL being shared contains [] it is altering that part of the URL to be [0].
I see the same behaviour with the Open Graph Object Debugger. If I enter the URL http://example.com?test%5B%5D=1 it gets transformed to http://example.com/?test%5B0%5D=1 (note the zero that has crept in).
I have tried with the initial URL both escaped and un-escaped with the same results.
I have tried configuring the like button with an href attribute and with a data-href attribute, and setting og:url manually, each with escaped and unescaped versions of the URL, but it insists on adding the 0 between the empty square brackets.
The application that is consuming the URL is a rails application which translates the request parameters to "test" => { "0" => "1" } whereas my application is expecting "test" => ["1"].
Any ideas how to get facebook to leave the URL as it is? (I really don't want to have to parse the parameters a different way just for facebook requests - other social media sharing options work okay and leave the URL alone).
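To make the mismatch concrete, and as a possible fallback if the URL cannot be kept intact, here is a rough sketch that turns the indexed form back into an array. The helper name indexed_hash_to_array is made up, and it assumes the value is a plain Hash with string keys rather than a Rails parameters object.

```ruby
# Hypothetical helper: converts {"0" => "a", "1" => "b"} into ["a", "b"].
# Anything that is not a hash keyed purely by digits is returned unchanged.
def indexed_hash_to_array(value)
  return value unless value.is_a?(Hash) && !value.empty?
  return value unless value.keys.all? { |key| key.to_s.match?(/\A\d+\z/) }
  # Sort numerically by index so "10" comes after "2".
  value.sort_by { |key, _| key.to_s.to_i }.map { |_, item| item }
end

indexed_hash_to_array({ "0" => "1" }) # => ["1"]
indexed_hash_to_array(["1"])          # => ["1"]
```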

How can you get the canonical URL for a web page (Rails)?

I need to store a distinct URL for an external webpage.
I need to put the URL into the database. I don't want to store the same page twice, so I need to strip all fluff off the URL.
# if I have
url_1 = "http://scientificamerican.com/royal-baby/?utm_campaign=promo"
# and
url_2 = "http://scientificamerican.com/royal-baby/?utm_source=email"
# then they should map to:
url_canonical = "http://scientificamerican.com/royal-baby/"
...it's not as simple as just stripping query parameters though
In order to get a single canonical URL regardless of what was on it I tried stripping the query string. The problem is that there are still CMSs which use the query string.
e.g.
url_1 = "https://www.scientificamerican.com/article.cfm?id=obama-budget"
# strip the query string and it becomes
url_1 = "https://www.scientificamerican.com/article.cfm"
# which is obviously the same for all articles :(
Is there any Rails tool for getting a page's canonical URL?
This is obviously a problem that a number of people have had to solve, not least the search engines. How do you reduce the URL down such that all that remains is the data for the page?
You can't. There is no way to know which query parameters are needed to identify the page. There are many parameters you can knowingly remove (e.g. utm_campaign), but not all.
Your best bet would be to load the HTML for the page and look for the canonical link element. If that exists, then you've got your canonical URL.
http://en.wikipedia.org/wiki/Canonical_link_element
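Both ideas from this answer can be sketched with the standard library alone. Everything named here is an assumption: the TRACKING_PARAMS pattern only covers utm_ parameters, and canonical_from_html uses a crude regular expression where a real application would more likely use an HTML parser such as Nokogiri. Fetching the page is left out.

```ruby
require 'uri'

# Assumed pattern for parameters that never identify a page.
TRACKING_PARAMS = /\Autm_/i

# Removes known tracking parameters and keeps everything else.
def strip_tracking_params(url)
  uri = URI.parse(url)
  return url if uri.query.nil?
  kept = URI.decode_www_form(uri.query).reject { |name, _| name.match?(TRACKING_PARAMS) }
  uri.query = kept.empty? ? nil : URI.encode_www_form(kept)
  uri.to_s
end

# Rough extraction of the href from <link rel="canonical" href="...">.
# Returns nil when the page declares no canonical link.
def canonical_from_html(html)
  tag = html[/<link\b[^>]*\brel=["']canonical["'][^>]*>/i]
  tag && tag[/\bhref=["']([^"']+)["']/i, 1]
end

strip_tracking_params("http://scientificamerican.com/royal-baby/?utm_campaign=promo")
# => "http://scientificamerican.com/royal-baby/"
strip_tracking_params("https://www.scientificamerican.com/article.cfm?id=obama-budget")
# => "https://www.scientificamerican.com/article.cfm?id=obama-budget"
```

A reasonable order would be to prefer the canonical link when the page has one and fall back to the stripped URL otherwise.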

How does one escape the # sign in a Url pattern in UrlMappings.groovy?

In order to maintain the current set of URLs in a project, I have to be able to use the # (pound sign) in the URL. For some reason the pound sign does not appear to work normally in UrlMappings.groovy in this project.
Is there a special escape-sequence that must be used when placing # signs in UrlMappings.groovy?
Am I missing some reason why one cannot use pound signs at all?
In the following URL Mapping example, the browser goes to the correct page, but the pageName variable is null:
"/test/${urlName}#/overview"(controller:'test', action:'overview') {
    pageName = "overview"
}
I thought everything after # in the URL is handled on the client side by the browser, which tries to find a matching anchor and scroll to that location.
If you dump the request containing the pound char, do you even see the data behind #?
I used a Named URL mapping and it works fine, no need to escape the "#" sign:
name test: "/#abc" (controller: 'test', action:'homepage')
EDIT: My answer above is wrong. In fact, it only works as a special case, when homepage is the default action of the view.
Netbrain is right, the part after "#" is never sent to the server. Instead, I found that it is possible to use "%23" in place of "#".
For example, instead of /test#/abc we should use /test%23/abc as URL mapping (both at client side & server side).
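The split between path and fragment can be seen in any URL parser. The following Ruby snippet is only an illustration of that split, not of Grails behaviour, and example.com and the foo path segment are placeholders: with a literal # the part after it lands in the fragment, which browsers do not send to the server, while with %23 it stays in the path.

```ruby
require 'uri'

# A literal "#" starts the fragment, which is not part of the path.
with_fragment = URI.parse("http://example.com/test/foo#/overview")
with_fragment.path      # => "/test/foo"
with_fragment.fragment  # => "/overview"

# An encoded "%23" stays inside the path.
encoded = URI.parse("http://example.com/test/foo%23/overview")
encoded.path      # => "/test/foo%23/overview"
encoded.fragment  # => nil
```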

How do SO URLs self correct themselves if they are mistyped?

If an extra character (like a period, a comma, a bracket or even letters) is accidentally added to a URL on the stackoverflow.com domain, no 404 error page is shown. Instead, the URL corrects itself and the user is led to the relevant webpage.
For instance, the extra 4 letters I added to the end of a valid SO URL to demonstrate this would be automatically removed when you access the below URL -
https://stackoverflow.com/questions/194812/list-of-freely-available-programming-booksasdf
I guess this has something to do with ASP.NET MVC Routing. How is this feature implemented?
Well, this is quite simple to explain I guess, even without knowing the code behind it:
The text is just candy for search engines and people reading the URL.
The URL works just as well with the complete text removed!
The only really important part is the question ID that's also embedded in the "path".
This is because EVERYTHING after http://stackoverflow.com/questions/194812 is ignored. It is just there to make the link more descriptive if it is posted somewhere.
Internally the URL is mapped to a handler, e.g., by a rewrite, that transforms into something like: http://stackoverflow.com/questions.php?id=194812 (just an example, don't know the correct internal URL)
This also makes the URL search engine friendly, besides being more readable to humans.
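As a rough illustration of the idea (not Stack Overflow's actual code, which is not shown here), a route only needs to capture the numeric ID and can ignore whatever follows it. The pattern and the method name below are assumptions.

```ruby
# Assumed route pattern: "/questions/<id>" optionally followed by any slug.
QUESTION_PATH = %r{\A/questions/(\d+)(?:/.*)?\z}

# Returns the question ID as an integer, or nil if the path does not match.
def question_id_from_path(path)
  match = QUESTION_PATH.match(path)
  match && match[1].to_i
end

question_id_from_path("/questions/194812/list-of-freely-available-programming-booksasdf") # => 194812
question_id_from_path("/questions/194812") # => 194812
question_id_from_path("/users/1")          # => nil
```

Once the ID is known, the handler can look up the question and redirect to the URL with the correct text, which would explain the self-correcting behaviour.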
