Why do so many Ruby on Rails apps have missing trailing slashes? - ruby-on-rails

Why do so many Ruby on Rails apps have missing trailing slashes in their URLs? One example is http://basecamphq.com/tour. AFAIK this goes against Web standards. Is it something to do with the way RoR is set up?

It's not against Web standards. http://basecamphq.com/tour is considered a file, while http://basecamphq.com/tour/ would be a directory (Note: the two URLs are not equivalent, although some web servers, e.g. Apache, will check the other if one doesn't exist). As both are kind of virtual, it's mainly up to the developer to decide (this is independent of the programming language or framework used).
I don't think it has anything to do with caching (as mentioned by nilamo), as there are enough HTTP headers for cache control. It might be that some reverse proxies have different default behavior, though.

Your argument is invalid:
The W3C's URL spec doesn't enforce trailing slashes on URLs.
This is what it says about slashes:
"The path is interpreted in a manner dependent on the scheme being used. Generally, the reserved slash "/" character (ASCII 2F hex) denotes a level in a hierarchical structure, the higher level part to the left of the slash."
Rails adheres quite well to this directive.
My hair is a bird!

Because a trailing slash denotes a directory, and you are not accessing directories in Rails, but pages. It's like tour.html in your example, except that .html can be omitted as it is the default.

I'd venture to say that since in RoR, the URL you type usually does not map to a static file in a directory, but is resolved dynamically by the routes.rb file, ending the path with a trailing slash doesn't make much sense.

Some like slashes, some don't. Impassioned arguments can be made for both sides.

Rails uses slashes as parameter token separators, and a route like
/post/:year/:page
matches both /post/2012/a-title and /post/2012/a-title/ by default, unless you do some magic. This has nothing to do with web standards.
From the point of view of the browser, these two paths are very different when it comes to resolving relative resources. In a response to the above containing <img src="image.png"/>, the browser will send a second request to the server for /post/2012/image.png (first case) or /post/2012/a-title/image.png (second case), because the browser uses the trailing slash to resolve paths as if they were directories.
However, Rails developers usually don't care because they don't write URLs explicitly when rendering content! They have URL helpers at their disposal which hide this logic from them... unless you don't use the helpers to generate content, in which case you do care.
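The relative-resource behavior described above can be reproduced outside a browser. This is a minimal Python sketch using the standard library's urljoin, which follows the same resolution rules; the example.com host and the resolve helper are made up for illustration, only the two page paths and image.png come from the answer.

```python
from urllib.parse import urljoin

# Hypothetical host; the two page paths are the ones from the answer above.
BASE = "http://example.com"


def resolve(page_path: str, relative_ref: str) -> str:
    """Resolve a relative reference against the URL of the page that contains it."""
    return urljoin(BASE + page_path, relative_ref)


# Same route, requested without and with a trailing slash.
without_slash = resolve("/post/2012/a-title", "image.png")
with_slash = resolve("/post/2012/a-title/", "image.png")

print(without_slash)  # http://example.com/post/2012/image.png
print(with_slash)     # http://example.com/post/2012/a-title/image.png
```

Without the trailing slash the last path segment is treated like a file name and dropped before the relative reference is appended; with the slash it is kept as a directory.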

This is a form of URL rewriting. It is not against web standards, it does a lot for usability, and it is generally considered to help your search engine rankings. Think of it this way.
You are telling your friend about this cool post you saw on someone's blog. Which URL is easier to tell your friend:
http://www.coolwebsite.com/post.aspx?id=aebe6ca7-6c65-4b5c-bac8-9849faa0a467
OR
http://www.coolwebsite.com/cool-ideas-for-posts/

Related

Kentico 8 multilanguage prefixes for alternate languages only & not root

I'm posting this on behalf of a client and am unfamiliar with Kentico and .NET so please bear with me.
The issue
Our client has a website in two languages, let's say English and German.
URLs are currently outputting like this:
example.com
example.com/home (when they try to redirect this to the root, they get a loop)
example.com?lang=de
example.com/home?lang=de
example.com/cat-l1/page
example.com/cat-l2/page?lang=de
Even with canonicalization, the above is very untidy and bad for SEO purposes.
My client has tried to implement multilingual prefixes for URLs in Kentico 8, but wound up with something like:
example.com/en
example.com/de
example.com/en/page
example.com/de/seite
This is better, but I neither want to redirect the root domain nor have the superfluous /en/ subdirectory.
I've gone through Kentico support forums and tried to source documentation, but this information doesn't appear to be readily available.
What I require
I would like to use multilingual prefixes ONLY for alternate languages (not the default). For instance:
example.com/
example.com/de
example.com/page
example.com/de/seite
Can someone please let me know:
What CMS settings need to be set in order to get the required URL structure to work?
If some kind of custom URL rewriting handler is required, what needs to be done? (I'll update this as I go, but don't even know where to look/start)
If I understand correctly, you want the default (English) with no prefix, and the other languages with a prefix. You can do it purely with Kentico settings (I had the same setup on one of my web sites). Go to Settings -> URLs and SEO -> SEO - Cultures.
You need to check the last two options (Use language prefix for URLs and Allow URLs without language prefixes). Here is the documentation. Also take a look at how to configure prefixes.
Also make sure that in Sites -> your site name -> Culture, the Default content culture is set to English. I know you can do it with routes, i.e. you will have /home for English and /de/home for German, but I don't think you can do it with standard URLs. Essentially you will have to switch the whole site to routes (if your site is not big you can do it manually).
P.S. When you add a new route, you need to restart the app for the route to work (especially in 8).

Host multiple Rails apps on the same server

I'm trying to host multiple rails apps for my blog. Kind of like www.blog.com/app1 will have one rails app, www.blog.com/app2 will have another. How do I do it?
Although I agree with the downvotes, as pointed out by the first comment, I had this problem myself several months ago and didn’t even try to solve it once I realized how many implications it has. Existing answers on Stack Overflow address either a slightly different or a narrower issue, so they may use some of the things mentioned here but don’t elaborate on implications or alternatives, yet there’s an interesting overview (and also another answer to that question). Anyway, I took it as a challenge and dived in.
First, there are multiple approaches depending on your scenario:
All applications are code which you maintain: it’s probably best to explore something called engines. They are like mini RoR applications mountable at a certain path within a normal RoR application. This has many benefits, like sharing the same runtime or simple isolation configured in one place.
If there is no AJAX involving URLs or similar dynamic behavior, or if it is actually AHAH (i.e., asynchronous HTML and HTTP, returning HTML fragments instead of XML or JSON data), which is very natural for Rails although often not used, you can use sophisticated proxy modules like mod_proxy_html which rewrite links inside HTML documents while proxying. Similar modules exist for nginx but are not part of the standard distribution.
RoR has a configuration option relative_url_root which allows deployment to subdirectories. It’s very fragile and often buggy, and many gems or engines break when you use it, so beware. When you get it right, it looks like magic. However, your configuration relating to the subdirectory will be scattered throughout different software configs and your code.
I created an example repository while exploring the last option. README should say everything necessary to run the code.
The most important observation from this small project is that when using relative URL root, you almost certainly want to scope all your routes. There are different setups possible, but they are even more complicated (which doesn’t mean they don’t make sense). For examples see the answer with overview mentioned above.
By default (without scoped routes), only asset paths are prefixed with the relative URL root, but not action route paths, even though that makes URLs generated by helpers useless unless they are translated by mod_proxy_html or probably a more custom solution.
Another important observation, which relates to the official guide, code “out there” and answers to similar questions here on Stack Overflow, is that it’s good to avoid a forward slash at the beginning of the relative URL root. It behaves inconsistently between tests and the rest of the code. Yet it can be used nicely around your code: see the scope definition in the routes config or the dummy controller test case.
I got to these and other observations by creating two very simple and almost identical Rails 5.2 applications. Each has one action (dummy#action) which has a route scoped to the relative URL root. This action, or more specifically its view, does two important things to verify that everything works:
it outputs the result of calling the root_path helper, which shows we have correctly set up URL/path helpers (thanks to the scoped route in config/routes.rb)
it loads a static asset which isn’t served by the Rails application but directly by Apache HTTP Server and which is referenced by the image_path helper
You can see that the virtual host configuration has a rather extensive list of URLs which shouldn’t be passed via the proxy and rely on aliased directories. However, this is application specific and very configurable, so a simpler setup with a different directory layout is definitely achievable but is an entirely separate topic.
If you like Passenger and don’t want to use proxying in your HTTP server, you can find more information in their deployment tutorial.

Is it safe to depend on a trailing slash in a URL for routing purposes?

I'm building a site that has products, each of which belongs to one or more categories, which can be nested within parent categories. I'd like to have SEO-friendly URLs, which look like this:
mysite.com/category/
mysite.com/category/product
mysite.com/category/sub-category/
mysite.com/category/sub-category/product
My question is: Is it safe to depend on the presence of a trailing slash to differentiate between cases 2 and 3? Can I always assume the user wants a category index when a trailing slash is detected, vs a specific product's page with no trailing slash?
I'm not worried about implementing this URI scheme; I've already done as much with PHP and mod_rewrite. I'm simply wondering if anybody knows of any objections to this kind of URL routing. Are there any known issues with browsers stripping/adding trailing slashes in the address bar, or with search engines crawling such a site? Any SEO issues or other stumbling blocks that I'm likely to run into?
In addition to the other pitfall ideas you mentioned, the user might himself change the URL (by typing the product or category) and add/remove the trailing "/".
To solve your problem, why not have a special sub-category "all" and instead of
"mysite.com/category/product" have "mysite.com/category/all/product"?
To me, it seems very unnatural that http://product/ and http://product would represent two entirely different resources. It is confusing, and it makes your URLs less hackable, since it is difficult to tell when a trailing slash should be present or not.
Also, in RFC 3986, Uniform Resource Identifier (URI): Generic Syntax, there is a note on Protocol-Based Normalization in section 6.2.4, which talks about this particular situation with regard to non-human visitors of your site, such as search engines and web spiders:
Substantial effort to reduce the incidence of false negatives is
often cost-effective for web spiders. Therefore, they implement
even more aggressive techniques in URI comparison. For example,
if they observe that a URI such as
http://example.com/data
redirects to a URI differing only in the trailing slash
http://example.com/data/
they will likely regard the two as equivalent in the future. (...)
One way to differentiate would be to make sure product pages have an extension, but category or sub-category pages do not. That is:
mysite.com/category/
mysite.com/category/product.html
mysite.com/category/sub-category/
mysite.com/category/sub-category/product.html
That makes it unambiguous.
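As a rough illustration of the extension-based scheme, here is a small Python sketch. The classify function and its return values are invented for this example; it assumes products always end in .html and everything else is a category, as in the list above.

```python
import posixpath


def classify(path: str) -> str:
    """Return 'product' or 'category' for a URL path.

    The decision depends only on the .html extension, so a stray or
    missing trailing slash does not change the result.
    """
    trimmed = path.rstrip("/")
    _, ext = posixpath.splitext(trimmed)
    return "product" if ext == ".html" else "category"


for path in (
    "/category/",
    "/category/product.html",
    "/category/sub-category",   # trailing slash stripped by a user or tool
    "/category/sub-category/product.html",
):
    print(path, "->", classify(path))
```

The point of the design is that the routing decision no longer hinges on a character that users and tools tend to add or drop.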
Never assume the user will do anything BUT the worst case scenario in anything URL related.
Unless you're prepared to do redirects in your code, assume a URI is equally likely to end with a slash or without one. That is the only way to make sure your code is robust, so that you won't have to worry about this kind of issue.
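If you do go the redirect route, the idea is to pick one canonical form and send everything else there. Below is a minimal Python sketch of that decision; canonical_url and redirect_target are hypothetical helper names, and choosing the trailing-slash form as canonical is just an assumption for the example.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str, trailing_slash: bool = True) -> str:
    """Return the URL with its path normalized to one trailing-slash policy."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    # The root path is always "/", whatever the policy.
    if trailing_slash or path == "":
        path += "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def redirect_target(url: str, trailing_slash: bool = True) -> Optional[str]:
    """Return where to redirect (e.g. with a 301), or None if the URL is already canonical."""
    target = canonical_url(url, trailing_slash)
    return None if target == url else target


print(redirect_target("http://mysite.com/category"))   # http://mysite.com/category/
print(redirect_target("http://mysite.com/category/"))  # None
```

Whichever form you choose, the handler behind it then only ever sees one variant.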
This question assumes that the addition of a trailing slash to a URL creates a URL that refers to a different resource. This is wrong; the semantics of URLs is that they both refer to the same resource. The presence of a trailing slash in a base URL merely changes how relative URLs are interpreted using that base URL.

Django, Rails Routing...Point?

I'm a student of web development (and college), so my apologies if this comes off sounding naive and offensive, I certainly don't mean it that way. My experience has been with PHP, and with a smallish project on the horizon (a glorified shift calendar) I hoped to learn one of the higher level frameworks to ease the code burden. So far, I have looked at CakePHP, Symfony, Django and Rails.
With PHP, the URLs mapped very simply to the files, and it "just worked". It was quick for the server, and intuitive. But with all of these frameworks, there is this inclination to "pretty up" the URLs by making them map to different functions and route the parameters to different variables in different files.
"The Rails Way" book that I'm reading admits that this is dog slow and is the cause of most performance pains on largish projects. My question is: why have it in the first place? Is there a specific point in the url-maps-to-a-file paradigm (or mod_rewrite to a single file) that necessitates regexes and complicated routing schemes? Am I missing out on something by not using them?
Thanks in advance!
URLs should be easy to remember and say. And the user should know what to expect when she sees that URL. Mapping a URL directly to a file doesn't always allow that.
You might want to use different URLs for the same, or at least similar, information. If your server forces you to use a 1 URL <-> 1 file mapping, you need to create additional files whose only function is to redirect to another file. Or you use stuff like mod_rewrite, which isn't easier than Rails' URL mappings.
In one of my applications I use a URL that looks like http://www.example.com/username/some additional stuff/. This can also be done with mod_rewrite, but at least for me it's easier to configure URLs in a Django project than in every Apache instance I run the application on.
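To make the "different URLs for the same information" point concrete, here is a toy route table in Python. It is only a sketch of the general idea, not how Rails or Django implement routing; the two URL patterns, show_profile and dispatch are all invented for the example.

```python
import re
from typing import Optional


def show_profile(username: str) -> str:
    """Single handler that serves a user's profile page."""
    return f"profile page for {username}"


# Two hypothetical URL shapes that both end up in the same handler,
# with no extra redirect files needed.
ROUTES = [
    (re.compile(r"^/users/(?P<username>[\w-]+)/?$"), show_profile),
    (re.compile(r"^/~(?P<username>[\w-]+)/?$"), show_profile),
]


def dispatch(path: str) -> Optional[str]:
    """Call the first handler whose pattern matches; None means 'not found'."""
    for pattern, handler in ROUTES:
        match = pattern.match(path)
        if match:
            return handler(**match.groupdict())
    return None


print(dispatch("/users/alice"))  # profile page for alice
print(dispatch("/~alice/"))      # profile page for alice
```

The named groups in the patterns become keyword arguments, which is roughly what frameworks mean by routing parameters to variables.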
just my 2 cents...
Most of it has already been covered, but nobody has mentioned SEO yet. Google puts a lot of weight on the URL itself. If that URL is widgets.com/browse.php?17, that is not very SEO friendly. If your URL is widgets.com/products/buttons/, that will have a positive impact on your page rank for buttons.
Storing application code in the document tree of the web server is a security concern.
a misconfiguration might accidentally reveal source code to visitors
files injected through a security vulnerability are immediately executable by HTTP requests
backup files (created e.g. by text editors) may reveal code or be executable in case of misconfiguration
old files which the administrator has failed to delete can reveal unintended functionality
requests to library files must be explicitly denied
URLs reveal implementation details (which language/framework was used)
Note that all of the above are not a problem as long as other things don't go wrong (and some of these mistakes would be serious even alone). But something always goes wrong, and extra lines of defense are good to have.
Django URLs are also very customizable. With PHP frameworks like CodeIgniter (I'm not sure about Rails) you're forced into the /class/method/extra/ URL structure. While this may be good for small projects and apps, as soon as you try to make it larger or more dynamic you run into problems and have to rewrite some of the framework code to handle it.
Also, routers are like mod_rewrite, but much more flexible. They are not regular expression-bound, and thus, have more options for different types of routes.
Depends on how big your application is. We've got a fairly large app (50+ models) and it isn't causing us any problems. When it does, we'll worry about it then.

URL Etiquette: can all my urls end with .php?

Given my new understanding of the power of "includes" with PHP, it is my guess that ALL of my pages on my site will be .php extension.
Would this be considered strange?
I used to think that most pages would be .htm or .html, but in looking around the net, I am noticing that there really isn't any "standard".
I don't really think I have a choice, if I want to call my menus from a php file. It is just going to be that way, far as I can see... so just bouncing off you all to get a feel for what "real programmers" feel about such issues.
The thing that actually matters to the browser isn't the file's extension; it's the MIME Type that it gets sent in the HTTP headers. Headers are data that gets sent before the actual file and tell what kind of data it is, how big it is, and a bunch of other unimportant junk. You can configure your server to send any file extension as an HTML page, but the most common extensions for HTML pages are .htm, .html, .php, .asp, .aspx, .shtml, .jsp, and several others.
As for it looking "strange", a surprisingly small number of users will actually look at the address bar at all, let alone notice that the file extension is .php instead of .html. I wouldn't worry about it if I were you; it really doesn't make a difference.
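To illustrate the MIME type point, here is a minimal Python sketch of a WSGI application that sends the same Content-Type header whatever extension the URL has. The app and content_type_for names are made up for the example, and a real PHP setup would have the web server do this for you.

```python
from html import escape


def app(environ, start_response):
    """Minimal WSGI app: every path, whatever its extension, is sent as HTML."""
    path = environ.get("PATH_INFO", "/")
    body = f"<p>You asked for {escape(path)}</p>".encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),  # this is what the browser looks at
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def content_type_for(path: str) -> str:
    """Call the app directly and return the Content-Type header it sent."""
    captured = {}

    def start_response(status, headers):
        captured.update(headers)

    app({"PATH_INFO": path}, start_response)
    return captured["Content-Type"]


for path in ("/menu.php", "/menu.html", "/menu"):
    print(path, "->", content_type_for(path))
```

All three paths report text/html; charset=utf-8, which is why the visible extension makes no difference to how the page is rendered.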
generally - make sure your URLs are easily read, reflect the content beneath them, and don't change. the "not changing" part can be tricky, especially when you shift technologies over time (html>php>aspx).
to achieve this just ensure that each area of your site appears to reside in its own subdirectory.
mysite.com/news/
mysite.com/aboutus/
mysite.com/products/
etc.
you can either do this by physically structuring your site in this fashion and using default documents (default.html/php/aspx), or using something like mod rewrite, ISAPI rewrite, or similar to rewrite these paths to the appropriate docs.
someone who's keen on SEO or marketing might have a different idea about what constitutes a "good" URL, but as a developer this is how i see it.
Ending URLs in .php is fine technically, but I think these days many people are trying to make the urls independent of the actual code/file structure.
I actually think that's a good thing from a software engineering perspective as well. URLs are conceptually different from (read: not related at all to) the file/directory structure used to organize the system powering the website.
The "resource" that a URL "locates" is not the .php or .asp file that contains the code to display it.
Look at stackoverflow for example, the URL of this question is /questions/322944/uql-etiquette, there's nothing in it that can be used to "guess" the underlying framework/system. The resource in this case is the question and all the answers to it, as well as the comments, votes, edits, and various other stuff.
It doesn't matter what your URLs end with, .php is fine, and fairly common. The only thing people care about these days when it comes to URLs is making them pretty for Search Engine Optimisation, but that's a whole new question.
Real programmers use URLs like /noun/verb/id/ & don't show file extensions at all :p
Personally I use Apache's mod-rewrite.
(on a slightly less tongue-in-cheek note) It's worth mentioning, specifically for includes, that you should ensure your actual files have the extension .php. I've seen more than one site where programming logic can be viewed in-browser 'cos the developer ended their files .inc (or insert non-auto-parsed extension of choice here).
As far as url etiquette goes - I really don't think etiquette is involved; however if you have sophisticated users visiting your website who have strong views on platforms and technologies, using .php or .aspx extensions could put off users - perhaps subconsciously.
If you use apache, it's fairly easy to make a .php be read as a .py and vice versa by changing the httpd.conf file. My current practice is to use .html extensions (or no extensions at all) and treat all files as .php.
Whatever you decide, do make sure that you never break an existing url. It's possible to achieve that even if you keep .php as the extension and decide to change the technology later.
