Is it worth using "pretty URLs" if you don't care about SEO/SEM - ruby-on-rails

I'm designing a hosted software-as-a-service application that's like a highly specialized version of 37signals' Highrise product. In that context, where SEO is a non-issue, is it worth implementing "pretty URLs" instead of going with numeric IDs (e.g. customers/john-smith instead of customers/1234)? I notice that a lot of web applications don't bother with them unless they provide real value (e.g. e-commerce apps, blogs - things that need SEO to be found via search engines).

It depends on how often URLs are passed along verbally by your users. People tend to find it relatively difficult to pronounce something like
http://www.domain.com/?id=4535&f=234&r=s%39fu__
and like
http://www.domain.com/john-doe
much better ;)

In addition to readability, another thing to keep in mind is that by exposing an auto-incrementing numeric key you also allow someone to guess the URLs for other resources, and you could give away certain details about your data. For instance, if someone signs up for your app and sees that their account is at /customer/12, it may affect their confidence in your application knowing that you only have 11 other customers. This wouldn't be an issue if they had a URL of /customer/some-company.
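A minimal sketch of one way to avoid leaking the sequence: expose a slug (or random token) instead of the primary key. This assumes a Customer model with name and slug columns; the names here are illustrative, not from the question.

class Customer < ActiveRecord::Base
  before_create :generate_slug

  # Build URLs from the slug instead of the numeric id.
  def to_param
    slug
  end

  private

  # "Acme Corp" => "acme-corp"; fall back to a random token if the name
  # is blank or the slug is already taken.
  def generate_slug
    candidate = name.to_s.parameterize
    candidate = SecureRandom.hex(4) if candidate.blank? || self.class.exists?(:slug => candidate)
    self.slug = candidate
  end
end

# In the controller, look the record up by slug rather than id:
#   @customer = Customer.find_by_slug!(params[:id])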

It's always worth it if you just have the time to do it right.
Friendly URLs look a lot nicer and give a better idea of where the link will lead. This is useful if the link is shared, e.g. via instant message.
If you're searching for a specific page in your browser history, a human-readable URL helps.
A friendly URL is a lot easier to remember (useful in some cases).
As said earlier, it is also a lot easier to communicate verbally (needed more often than you'd think).
It hides unnecessary technical details from the user. In one case where the user ID was visible in the URL, several users asked why their user ID was higher than the total number of users. No damage done, but why have a confused user if you can avoid it?

I sure am a lot more likely to click on a link when I mouse over it and it shows http://www.example.com/something-i-am-interested-in.html,
rather than http://www.example.com/23847ozjo8uflidsa.asp.
It's quite annoying clicking links on MSDN because I never know what I'm going to get.

When I create applications I try my best to hide their structure from prying eyes - while how much "SEO" you get out of it is subjective, pretty URLs tend to help people navigate and understand where they are, while exposing less of your internals to possible probing.
I notice you're using Rails - so you probably won't have a huge query string like in ASP, PHP, or those other languages - but in my opinion the added cleanliness and overall appearance is a plus for customer interaction. When sharing links it's nicer for customers to be able to copy the URL customer/john_doe than to have to hunt for a "link me" or a random /customer/
Marco

I typically go with a combination -- keeping the ease of using Rails RESTful routing while still providing some extended information in URLs.
My app URLs look something like this:
http://example.com/discussions/123-is-it-worth-using-pretty-urls/
http://example.com/discussions/123-is-it-worth-using-pretty-urls/comments
http://example.com/discussions/123-is-it-worth-using-pretty-urls/comments/34567
You don't have to add ANY custom routes to pull this off, you just need to add the following method to your model:
def to_param
  [id, permalink].join("-")
end
And ensure any find that uses params[:id] in your controller converts it to an integer via params[:id].to_i - to_i stops at the first non-numeric character, so "123-is-it-worth-using-pretty-urls".to_i returns 123.
Just a note, you'll need to set a permalink attribute when your record is saved...
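Putting the pieces together, a fuller sketch might look like this (the permalink column, Discussion model, and parameterize call are illustrative assumptions, not from the answer above):

class Discussion < ActiveRecord::Base
  before_save :set_permalink

  # Produces "123-is-it-worth-using-pretty-urls" instead of just "123".
  def to_param
    [id, permalink].join("-")
  end

  private

  def set_permalink
    self.permalink = title.to_s.parameterize
  end
end

class DiscussionsController < ApplicationController
  def show
    # to_i drops everything after the leading digits, so the slug is ignored.
    @discussion = Discussion.find(params[:id].to_i)
  end
end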

If your application is RESTful, the URLs that Rails gives you are SEO-friendly by default.
In your example, customers/1234 will probably return something like
<h1>Customer</h1>
<p><strong>Name:</strong> John Smith</p>
etc etc
Any current SEO spider will be smart enough to parse the destination page and extract that "John Smith" from there anyway.
So, in that sense, customers/1234 is already a "nice" URL (as opposed to other systems, in which you would have something like resource/123123/1234 for customer 1234 and resource/23232/321 for client 321).
Now, if you want your users to be regularly using URLs (like in Delicious, etc.) you might want to start using logins and readable fields instead of IDs.
But for SEO, ids are just fine.
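For reference, those default RESTful URLs come from a single line in config/routes.rb (using the customers resource from the question):

# config/routes.rb
resources :customers

# Generates, among others:
#   GET /customers           -> customers#index
#   GET /customers/1234      -> customers#show  (customer_path(@customer))
#   GET /customers/1234/edit -> customers#edit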

Related

How would I go about routing visitors based on location in rails?

I'm working on a Rails app that needs to route users to a specific URL based on their location. Preferably something that will present them the appropriate content based on location, with the ability for them to view content for other locations.
Specifically, think of the location interface for Craigslist... Users are presented content from the city they are in and still allowed to select and view another city.
I've seen a few posts that answer parts of this question, but I'm trying to plan out the best solution.
It looks like there will need to be something, probably cookie-based, that sets a 'default' location for a given user and still allows them to select other locations.
Again, just looking for concept/planning assistance and any direction on any gems that might be applicable.
Thanks in advance!
http://dev.maxmind.com/geoip/geolite is a free geo-IP database that works pretty well. It makes some mistakes (it put a client's office of mine in Kirkland, WA when they are in fact in downtown Seattle, WA). It's certainly good enough for Craigslist-level specificity, since you'd be re-routing both those people to "seattle" anyway. There's a Ruby gem for it as well - "geoip-c". It's very easy to use.
The other option would be to use HTML5's "gimme your location" functionality. More intrusive for the user, but might be more specific.
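A rough sketch of the cookie-based default described above. The lookup_city helper is a hypothetical stand-in for whatever your GeoIP library returns; the cookie and parameter names are illustrative.

class ApplicationController < ActionController::Base
  before_filter :set_current_city   # before_action in newer Rails

  private

  def set_current_city
    # An explicit choice (e.g. from a "change city" link) wins and is remembered.
    cookies.permanent[:city] = params[:city] if params[:city].present?

    # Otherwise fall back to the cookie, then to a GeoIP lookup, then a default.
    @current_city = cookies[:city] || lookup_city(request.remote_ip) || "seattle"
  end

  # Hypothetical wrapper around the GeoIP database lookup (e.g. the geoip-c gem).
  def lookup_city(ip)
    # ... query the GeoLite city database here ...
  end
end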

We're dropping our beta tag. What will Google do about it?

Until now, our ASP.NET MVC site was accessible at http://beta.fleex.tv. Now that we dropped our beta label, it is located at http://fleex.tv.
We set up an HTTP redirect from beta.fleex.tv to fleex.tv through our domain name registrar, 1&1. That redirect is pretty brutal: it doesn't look at the page requested, just the domain, and will for instance redirect http://beta.fleex.tv/page?arg=0 straight to http://fleex.tv.
I have 2 questions:
Is there a simple way to redirect http://beta.fleex.tv/page?arg=0 to http://fleex.tv/page?arg=0? Is this a good idea, or should we instead delete beta.fleex.tv altogether?
What should we do with Google?
If we keep the 'beta' pages, what will happen to them in Google's index? With the current redirects in place they all point to http://fleex.tv. My guess is that Google will start detecting duplicate content (or even redirected content) and delete everything from the index, but I'd love to understand how things will go in more detail.
If we submit a new sitemap with all the fleex.tv nodes, will Google penalize us in any way or will it simply start indexing those pages from scratch, untouched by the beta.fleex.tv debacle?
Generally speaking I'd love to know what you guys think the best strategy might be here. This seems like a fairly common problem. I feel there's no way to avoid losing all the indexing that Google has done; in that case, I'd just like to know how this whole operation will affect our 'reputation' with Google...
Please shoot questions if this is unclear.
When it comes to SEO this is a disaster. You need to do page-level 301 redirects.
Having a different sub-domain was a mistake from the beginning, because it now causes trouble and the links pointing to you are inaccurate.
Redirecting is not particularly hard to do: in Application_BeginRequest look at the Request.Host and Request.RawUrl properties and redirect if necessary.
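The asker's stack is ASP.NET MVC, but since this thread is otherwise Rails-centric, here is roughly what the same page-level 301 looks like in Rails routing terms, purely as an illustration (hostnames taken from the question):

# config/routes.rb - catch every request to the old host and issue a
# page-level 301 that preserves the path and query string.
constraints(:host => "beta.fleex.tv") do
  match "*path", :to => redirect { |params, req| "http://fleex.tv#{req.fullpath}" }
end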
I was going to give the same advice as usr, but to reiterate - this will absolutely murder your rankings and organic traffic if you don't fix it, especially the long-tail ones.
You've got about a three-to-four-week grace period to get page-level redirects implemented before you start to lose value from the old URLs.
I would also add that because you're using 302s instead of 301s (very common in ASP.NET environments), Google will take much longer to view the redirects as permanent, repeatedly checking back to see if the pages have moved - and therefore not passing value from the subdomain to the main domain - and from personal experience will in the end pass less link value overall.
http://support.google.com/analytics/bin/answer.py?hl=en-GB&answer=2613318

How can I strip request values out of my Rails url?

In my Rails 3 application, I list many items on the homepage. Some of them are obscure, and I would like to limit my list to only popular items unless the user clicks a specific link that basically "zeroes out" the limiter.
What I have now works, but when the user chooses to "Show all items", I end up with an ugly URL:
http://myapp.com/?limiter=0
Is there any way that I can strip that out so that the user does not see the ugly attribute at the end of the URL?
No, don't use POST. POST is only supposed to be used when you are making a state change on the server. Use an AJAX GET if you really need to do this.
Better yet, get used to seeing GET parameters like this. It's normal. And, it's like that for a reason: it allows bookmarking a resource, including whatever settings are needed to reproduce the request later.
Read up on REST. Learn it. Live it. Love it.
There are a number of approaches you could take. Probably the most obvious one is to have a separate page for your show_all. It sounds like you're trying to do too much with your homepage.
If you must have these on the homepage, and your link is also on the homepage, you could use an AJAX call to load up your items without having to redirect to that URL.
Finally, I suppose you could try making a route just for this situation. I don't really have any experience with Rails 3 routes, though, so I can't suggest any syntax (see the sketch after this answer).
Really, though, this smells like an application design problem, not a technical problem. I strongly encourage you to rethink how you are trying to do this. This doesn't sound like a feature that is appropriate to put on your homepage. Make a separate show_all action.
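For what it's worth, the dedicated-route idea might look something like this in Rails 3; the /all path, show_all parameter, and popular scope are illustrative assumptions.

# config/routes.rb (Rails 3 syntax)
root :to => "items#index"                                          # popular items only
match "/all", :to => "items#index", :defaults => { :show_all => true }

# app/controllers/items_controller.rb
class ItemsController < ApplicationController
  def index
    # http://myapp.com/all shows everything, with no query string in the URL.
    @items = params[:show_all] ? Item.all : Item.where(:popular => true)
  end
end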

Why would Google Search use client-side URL parameters?

Yesterday morning I noticed Google Search was using hash parameters:
http://www.google.com/#q=Client-side+URL+parameters
which seems to be the same as the more usual search (with search?q=Client-side+URL+parameters). (It seems they are no longer using it by default when doing a search using their form.)
Why would they do that?
More generally, I see hash parameters cropping up on a lot of web sites. Is it a good thing? Is it a hack? Is it a departure from REST principles? I'm wondering if I should use this technique in web applications, and when.
There's a discussion by the W3C of different use cases, but I don't see which one would apply to the example above. They also seem undecided about recommendations.
Google has many live experimental features that are turned on/off based on your preferences, location and other factors (probably random selection as well). I'm pretty sure the one you mention is one of those.
What happens in the background when a hash is used instead of a query string parameter is that it queries the "real" URL (http://www.google.com/search?q=hello) using JavaScript, then it modifies the existing page with the content. This will appear much more responsive to the user since the page does not have to reload entirely. The reason for the hash is so that browser history and state are maintained. If you go to http://www.google.com/#q=hello you'll find that you actually get the search results for "hello" (even if your browser is really only requesting http://www.google.com/). With JavaScript turned off, however, it wouldn't work, and you'd just get the Google front page.
Hashes are appearing more and more as dynamic web sites are becoming the norm. Hashes are maintained entirely on the client and therefore do not incur a server request when changed. This makes them excellent candidates for maintaining unique addresses to different states of the web application, while still being on the exact same page.
I have been using them myself more and more lately, and you can find one example here: http://blixt.org/js -- If you have a look at the "Hash" library on that page, you'll see my implementation of supporting hashes across browsers.
Here's a little guide for using hashes for storing state:
How?
Maintaining state in hashes implies that your application (I'll call it application since you generally only use hashes for state in more advanced web solutions) relies on JavaScript. Without JavaScript, the only function of hashes would be to tell the browser to find content somewhere on the page.
Once you have implemented some JavaScript to detect changes to the hash, the next step would be to parse the hash into meaningful data (just as you would with query string parameters.)
Why?
Once you've got the state in the hash, it can be modified by your code (or your user) to represent the current state in your application. There are many reasons for why you would want to do this.
One common case is when only a small part of a page changes based on a variable, and it would be inefficient to reload the entire page to reflect that change (Example: You've got a box with tabs. The active tab can be identified in the hash.)
Other cases are when you load content dynamically in JavaScript, and you want to tell the client what content to load (Example: http://beta.multifarce.com/#?state=7001, will take you to a specific point in the text adventure.)
When?
If you have a look at my "JavaScript realm" you'll see a borderline-overkill case. I did it simply because I wanted to cram as much JavaScript dynamics into that page as possible. In a normal project I would be conservative about when to do this, and only do it when you will see positive changes in one or more of the following areas:
User interactivity
Usually the user won't see much difference, but the URLs can be confusing
Remember loading indicators! Loading content dynamically can be frustrating to the user if it takes time.
Responsiveness (time from one state to another)
Performance (bandwidth, server CPU)
No JavaScript?
Here comes a big deterrent. While you can safely rely on 99% of your users to have a browser capable of using your page with hashes for state, there are still many cases where you simply can't rely on this. Search engine crawlers, for example. While Google is constantly working to make their crawler work with the latest web technologies (did you know that they index Flash applications?), it still isn't a person and can't make sense of some things.
Basically, you're at a crossroads between compatibility and user experience.
But you can always build a road in between, which of course requires more work. In less metaphorical terms: implement both solutions so that there is a server-side URL for every client-side URL that outputs relevant content. For compatible clients it would redirect them to the hash URL. This way, Google can index "hard" URLs, and when users click them, they get the dynamic state stuff!
Recently Google also stopped serving direct links in search results, offering redirects instead.
I believe both have to do with gathering usage statistics: which searches were performed by the same user, in what sequence, which of the search results the user followed, etc.
P.S. Now, that's interesting: direct links are back. I absolutely remember seeing only redirects there in the last couple of weeks. They are definitely experimenting with something.

Is it OK to include an ID inside the URL?

Well, my question is simple.
Does the ID affect the position of a webpage on Google?
I have links like this
http://example.com/news/title-slug/15/
and people say to me that I should remove the ID from the URL.
And I believe that is not true. By my logic, you can't depend on the title's slug. I know it should work perfectly fine if there aren't two pages that have the same title, but why should I remove the ID if there is no harm in it being there?
Yes, leave it there.
Google has no business trying to second-guess what each element of a URL represents and changing its index based on that.
URLs by their nature can map to any resource, and I'm pretty sure Google recognises that. All you should do is ensure that multiple URLs don't have the same content by using redirects. So, for example, http://example.com/news/wrong-title-slug/15/ should redirect back to http://example.com/news/title-slug/15/ rather than just echo back the same page. Google doesn't really like duplicate content.
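In Rails terms, that canonical redirect is usually a couple of lines in the show action; the route, model, and column names below are illustrative, not from the answer.

# config/routes.rb
match "news/:slug/:id" => "news#show", :as => :news_article

# app/controllers/news_controller.rb
class NewsController < ApplicationController
  def show
    @article = Article.find(params[:id])

    # A wrong or outdated slug gets a permanent redirect to the canonical URL,
    # so the same content is never served under two different addresses.
    if params[:slug] != @article.slug
      redirect_to news_article_path(:slug => @article.slug, :id => @article.id),
                  :status => :moved_permanently
    end
  end
end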
It's fine.
I would not put it after the title slug, though. Some URLs might get more confusing than others.
http://example.com/entry/how-to-solve-question-45/15
a better one would be :
http://example.com/entry/15/how-to-solve-question-45
Besides, you can't really rely on just the title slug, because changing the title of an entry means breaking users' bookmarks. Not to mention that it is faster to retrieve an entry from the database by an integer ID than by a URL slug.
The problem here is not whether Google will accept it, but whether or not doing so is user-friendly.
A common reason for keeping the ID in a URL is to ensure that the URL is unique. For example, if two people on here were to create a question named "Jon Skeet Facts" we'd have a problem, whereas with the ID the users are aware that they are two different questions with the same title. This is the same as with relational databases where a unique identifier is required.
In essence, why care what Google thinks? The whole Search Engine Optimisation industry is a farce, and this is coming from someone who has been paid more than once as an SEO consultant. Why follow what Google wants when you can match Google's intentions by making your website perfect for the user? If you make a good website, Google will reward you. The ID has a reason to be there, so keep it in.
I think you're fine leaving it in. It seems to make sense, as you get one element for identification and one for description. It's done on here, after all.
Zeus won't strike you down for it. I prefer not to have meaningless numbers in there because it's not very attractive or semantic.
Having the ID will NOT hurt your SEO rankings. Having the slug there ensures that the page's main keywords will be indexed, so it's all good.
