Well, my question is simple.
Does the ID affect the position of a webpage on Google?
I have links like this
http://example.com/news/title-slug/15/
and people say to me that I should remove the ID from the URL.
And I believe that is not true. By my logic, you can't depend on the title's slug. I know it should work perfectly fine if there aren't two pages that have the same title, but why should I remove the ID if there is no harm when it's there?
Yes, leave it there.
Google has no business trying to second-guess what each element of a URL represents and changing its index based on that.
URLs by their nature can map to any resource, and I'm pretty sure Google recognises that. All you should do is ensure that multiple URLs don't have the same content by using redirects. So, for example, http://example.com/news/wrong-title-slug/15/ should redirect back to http://example.com/news/title-slug/15/ rather than just echo back the same page. Google doesn't really like duplicate content.
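The redirect described above could be sketched roughly as follows. This is a minimal, framework-free illustration: the `ARTICLES` hash and the `resolve_news_path` function are hypothetical names standing in for a real database lookup and a real router.

```ruby
# Hypothetical in-memory store: article ID => canonical slug.
ARTICLES = { 15 => "title-slug" }.freeze

# Returns [status, redirect_location_or_nil] for a path like /news/<slug>/<id>/.
def resolve_news_path(path)
  match = path.match(%r{\A/news/[^/]+/(\d+)/?\z})
  return [404, nil] unless match

  id = match[1].to_i
  canonical_slug = ARTICLES[id]
  return [404, nil] if canonical_slug.nil?

  canonical_path = "/news/#{canonical_slug}/#{id}/"
  # Anything other than the canonical path gets a permanent redirect,
  # so only one URL ever serves the content.
  return [301, canonical_path] unless path == canonical_path

  [200, nil]
end
```

Here the ID alone selects the article, and the slug is only compared against the stored one to decide whether to redirect.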
It's fine.
But I would not put the ID after the title slug, though. Some URLs get more confusing than others that way:
http://example.com/entry/how-to-solve-question-45/15
a better one would be :
http://example.com/entry/15/how-to-solve-question-45
Besides, you can't really rely on just the title slug, because changing the title of an entry means breaking users' bookmarks. Not to mention that it is faster to retrieve an entry from the database by an integer ID than by a URL slug.
The problem here is not whether Google will accept it, but whether or not doing so is user-friendly.
A common reason for keeping the ID in a URL is to ensure that the URL is unique. For example, if two people on here were to create a question named "Jon Skeet Facts" we'd have a problem, whereas with the ID the users are aware that they are two different questions with the same title. This is the same as with relational databases where a unique identifier is required.
In essence, why care what Google thinks? The whole Search Engine Optimisation industry is a farce, and this is coming from someone who has been paid more than once as an SEO consultant. Why follow what Google wants when you can match Google's intentions by making your website perfect for the user? If you make a good website, Google will reward you. The ID has a reason to be there, so keep it in.
I think you're fine leaving it in. It makes sense, as you get one element for identification and one element for description. It is done on here, after all.
Zeus won't strike you down for it. I prefer not to have meaningless numbers in there because it's not very attractive or semantic.
Having the id will NOT hurt your SEO rankings. Having the slug there ensures that the page's main keywords will be indexed so it's all good.
I remember seeing somebody say something in some comments somewhere that it isn't very safe to expose the id of a model record to a web page. I tried looking for some ways around it but couldn't find any specific documentation on the likes.
Does anybody know about why this is and whether or not it should be avoided? Also how it might be avoided?
The only thing a hacker can do with an ID is look it up in your database, and if they can do that, you are already hosed anyway. That's why such IDs are called "fictitious", because they don't relate to anything else in reality, such as a Social Security Number.
Sometimes you should, and sometimes you should not, put an ID into a URI - http://example.com/myController/42. If you do that, secure your page so unauthorized users cannot trivially change the number and see what records they can find. But you need that security anyway, even if you use a "slug" to find records, because users can poke around looking for different slugs. The main reason not to use /42 is that URIs are part of your usability envelope, and they should be literate and user-friendly.
And I put database IDs inside HTML IDs all the time, to make them unique, such as <input type="text" name="username" id="username_42" />. That's still just as secure.
It isn't. Your user ID is at the link in your name here at this question and the question ID is just up there in the URL. The only thing that is unsafe is if you allow people that know the IDs but should not have access to do stuff you didn't want them to do.
Safety is built into your app, not if you're showing IDs of your models or not.
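As a rough illustration of "safety is built into your app", the sketch below checks ownership after looking a record up by the ID taken from the URL. Everything here (`Invoice`, `User`, `INVOICES`, `fetch_invoice`, and the two error classes) is a hypothetical stand-in, not part of any framework.

```ruby
# Hypothetical record and user types, for illustration only.
Invoice = Struct.new(:id, :owner_id, :total)
User = Struct.new(:id, :admin)

INVOICES = {
  41 => Invoice.new(41, 1, 120),
  42 => Invoice.new(42, 2, 300)
}.freeze

class Forbidden < StandardError; end
class NotFound < StandardError; end

# Look the record up by the ID from the URL, then verify that the
# current user may see it before returning anything.
def fetch_invoice(current_user, id_param)
  invoice = INVOICES[id_param.to_i]
  raise NotFound if invoice.nil?
  raise Forbidden unless current_user.admin || invoice.owner_id == current_user.id

  invoice
end
```

With a check like this in place, changing /42 to /41 in the address bar only reveals a record the user was already allowed to see.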
I am working on an application that will provide information for certain events, and am wondering what the best way to structure my URI resources is.
The easiest way is simply to use an ID for each event, such as:
Baseurl/Events/{EventId}
The issue with this is that the ID is obviously not something that will be known to the customer. I would prefer to have something more like:
Baseurl/Events/{EventName}
Perhaps a more important reason for doing this is for SEO purposes. If I am targeting a keyword for the event, surely it would be more beneficial to have the event name in the URL?
My issue with using the event name is that obviously it’s not as ‘parseable’ as an ID, in that it becomes sensitive to event name changes etc. Also, adding spaces into the URI means that customers aren’t likely to explore by typing resource names in, and again could lead to parsing issues.
What is the standard practice in this area? Is using an ID the norm, or using a resource name? If I take WordPress as an example, I know that the post name can act as the resource identifier, so I know of at least one instance of the name being used.
Go for the hybrid approach, much like how StackOverflow is built: use the ID in the URL for your internal usage and append the name afterwards for readability and SEO.
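A minimal sketch of the hybrid approach, assuming a URL shape of /Events/{EventId}/{slug}. The helper names `slugify` and `event_path` are made up for this example, and the slug rule is a simplified ASCII-only one.

```ruby
# Turn a display name into a URL-safe slug (simplified, ASCII-only):
# lowercase it, collapse every run of other characters into one hyphen,
# and trim hyphens from both ends.
def slugify(name)
  name.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-+|-+\z/, "")
end

# The ID identifies the event; the slug is appended for readability.
def event_path(id, name)
  "/Events/#{id}/#{slugify(name)}"
end

puts event_path(7, "Summer Jazz Festival 2024!")
```

Because spaces and punctuation are replaced by hyphens, the parsing worries about spaces in the event name do not arise, and the lookup itself still uses only the ID.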
I was going to ask this on Meta but I think it's a general enough question to warrant a place here instead.
I'm interested in knowing some of the ways you manage permalinks in your site, specifically permalinks that are built from data that can change over time.
Stack Overflow is a good example of this, whereby the URL to a question is partly made up from the question title. Without posting a dud question to test, I'm unsure whether the link to the question changes if the title of the question changes. My guess is that it doesn't, and if it does, a canonical is likely retained to the original URL.
Changing the title on SO does not change the url
Given that this is the case, is it common practice to store permalinks against posts in your database? And if so, how much of the permalink would you store?
I ask the latter because there's only one part of the URL that's variable in the context of SO, and that's the question title. So should we store only the sanitized title and build up the rest based on the static information we have from the post, or should we store the whole url including the controller name and Id (etc.)?
What you usually want is some identifier uniquely identifying the data item you want to link to (in SO's case the question). How you build your URL is more a question of what you think you will be able to support for a long time and how to convey additional information to the reader.
If you look at SO URLs, you notice that they put the unique identifier at the beginning (the number after /questions/) which is enough to get to the question (try putting garbage in the rest of the URL, it will still redirect to your question). Therefore, the title at the end is just eyecandy for the user and not really used in resolving the question.
I think it's relatively common to store the permalink in the database. Space is cheap and string parsing functions can be expensive (making a question title HTTP friendly a few thousand times across thousands of questions will eat some processor) each time you want to display the link.
As for how much to store, personally, I would only store the HTTP friendly version of your question/post title in the DB (along with a primary key) for the following reasons.
Storing the entire or even part of the URL that concerns itself with Actions and Controllers will make it really, really hard to refactor/rename those things down the road. You would either need to run mass DB updates or custom URL rewrites, etc.
Only storing the friendly version of the title allows you to use it in other places. Let's take the URL to this question for example: it was probably generated by @Html.ActionLink(Question.Title, "Index", new {controller = "Questions", Id = Question.Id, Slug = Question.Slug}). Keeping the slug as a separate parameter, you can use the questionId and questionSlug parameters in other controller/action calls and keep your URLs pretty.
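To make the "store only the slug" idea concrete, here is a small sketch in plain Ruby. The `Post` class and `question_url` helper are hypothetical; the point is that the slug is computed once and kept with the record, while the rest of the URL is assembled from route information at render time.

```ruby
# Hypothetical post record: the slug is computed once, when the post is
# created, and stored next to the ID. Later title edits do not touch it.
class Post
  attr_reader :id, :slug
  attr_accessor :title

  def initialize(id, title)
    @id = id
    @title = title
    @slug = title.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-+|-+\z/, "")
  end
end

# The controller segment is not stored with the post, so renaming it
# later requires no database update.
def question_url(post, base: "https://example.com", controller: "questions")
  "#{base}/#{controller}/#{post.id}/#{post.slug}"
end
```

Whether the stored slug should ever be regenerated after a title change is a separate design decision; this sketch simply never changes it.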
I'm designing a hosted software-as-a-service application that's like a highly specialized version of 37signals' Highrise product. In that context, where SEO is a non-issue, is it worth implementing "pretty URLs" instead of going with numeric IDs (e.g. customers/john-smith instead of customers/1234)? I notice that a lot of web applications don't bother with them unless they provide real value (e.g. e-commerce apps, blogs - things that need SEO to be found via search engines).
Depends on how often URLs are transmitted verbally by its users. People tend to find it relatively difficult to pronounce something like
http://www.domain.com/?id=4535&f=234&r=s%39fu__
and like
http://www.domain.com/john-doe
much better ;)
In addition to readability, another thing to keep in mind is that by exposing an auto-incrementing numeric key you also allow someone to guess the URLs for other resources, and you could give away certain details about your data. For instance, if someone signs up for your app and sees that their account is at /customer/12, it may affect their confidence in your application, knowing that you only have 11 other customers. This wouldn't be an issue if they had a URL of /customer/some-company.
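A name-based path such as /customer/some-company needs some rule for two customers whose names produce the same slug. One possible rule, sketched below, is to append a counter; the `SlugRegistry` class is hypothetical and keeps its state in memory, where a real application would check the database instead.

```ruby
require "set"

# Hands out name-based slugs and keeps them unique by appending a counter.
class SlugRegistry
  def initialize
    @taken = Set.new
  end

  # Returns a slug for the given name that no earlier call has received.
  def claim(name)
    base = name.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-+|-+\z/, "")
    candidate = base
    suffix = 2
    while @taken.include?(candidate)
      candidate = "#{base}-#{suffix}"
      suffix += 1
    end
    @taken.add(candidate)
    candidate
  end
end
```

The suffix only appears when names collide, so it reveals far less than a global auto-incrementing key does.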
It's always worth it if you just have the time to do it right.
Friendly URLs look a lot nicer and they give a better idea of where the link will lead. This is useful if the link is shared, e.g. via instant message.
If you're searching for a specific page in your browser history, a human-readable URL helps.
A friendly URL is a lot easier to remember (useful in some cases).
As mentioned earlier, it is also a lot easier to communicate verbally (needed more often than you'd think).
It hides unnecessary technical details from the user. In one case where user id was visible in the url, several users asked why their user id is higher than total amount of users. No damage done, but why have a confused user if you can avoid it.
I sure am a lot more likely to click on a link when I mouse over it and see http://www.example.com/something-i-am-interested-in.html
rather than http://www.example.com/23847ozjo8uflidsa.asp.
It's quite annoying clicking links on MSDN because I never know what to expect I will get.
When I create applications I try my best to hide its structure from prying eyes - while it's subjective on how much "SEO" you get out of it - Pretty URLs tend to help people navigate and understand where they are while protecting your code from possible injections.
I notice you're using a Rails app - so you probably wouldn't have a huge query string like in ASP, PHP, or those other languages - but in my opinion the added cleanliness and overall appearance is a plus for customer interaction. When sharing links it's nicer for customers to be able to copy the url: customer/john_doe than have to hunt for a "link me" or a random /customer/
Marco
I typically go with a combination -- keeping the ease of using Rails RESTful routing while still providing some extended information in URLs.
My app URLs look something like this:
http://example.com/discussions/123-is-it-worth-using-pretty-urls/
http://example.com/discussions/123-is-it-worth-using-pretty-urls/comments
http://example.com/discussions/123-is-it-worth-using-pretty-urls/comments/34567
You don't have to add ANY custom routes to pull this off, you just need to add the following method to your model:
def to_param
  # Prefix the slug with the numeric id, e.g. "123-is-it-worth-using-pretty-urls"
  [id, permalink].join("-")
end
And ensure that any find using params[:id] in your controller receives an integer, by calling params[:id].to_i.
Just a note, you'll need to set a permalink attribute when your record is saved...
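The reason this works without custom routes can be seen in plain Ruby, outside Rails. In the sketch below, `Discussion` is a Struct standing in for a model (in a real app, id and permalink would be database columns), and the permalink value is supplied by hand.

```ruby
# Plain-Ruby stand-in for a model, for illustration only.
Discussion = Struct.new(:id, :permalink) do
  def to_param
    [id, permalink].join("-")
  end
end

discussion = Discussion.new(123, "is-it-worth-using-pretty-urls")
param = discussion.to_param

# String#to_i reads the leading digits and stops at the first character
# that is not part of a number, so the slug is simply ignored.
record_id = param.to_i

puts param
puts record_id
```

Note that a string with no leading digits converts to 0, so a URL that has lost its numeric prefix will not find a record.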
If your application is restful, the URLs that rails gives you are SEO-friendly by default.
In your example, customers/1234 will probably return something like
<h1>Customer</h1>
<p><strong>Name:</strong> John Smith</p>
etc etc
Any current SEO spider will be smart enough to parse the destination page and extract that "John Smith" from there anyway.
So, in that sense, customers/1234 is already a "nice" URL (as opposed to other systems, in which you would have something like resource/123123/1234 for customer 1234 and resource/23232/321 for client 321).
Now, if you want your users to be regularly using urls (like in delicious, etc) you might want to start using logins and readable fields instead of ids.
But for SEO, ids are just fine.
FOR SEARCH ENGINE OPTIMIZATION PURPOSES, does the location of the slug within a URL matter?
There's no doubt that you could code URL slugs to work properly in any order. I'm more interested to know if search engines place different weights to portions of the URL on the right-hand-side vs the left-hand-side
For example, here the slug appears at the end of the URL:
https://stackoverflow.com/questions/47427/why-do-some-websites-add-slugs-to-the-end-of-urls
Whereas here the slug appears in the middle of the URL:
https://stackoverflow.com/questions/why-do-some-websites-add-slugs-to-the-end-of-urls/47427
It's better to push whatever has less semantic content to the right because it's more likely to get chopped off by length limits on what's considered relevant. So the second form you post would be better for SEO purposes than the way SO does it. (Better yet is using the slug as a real identifier and keeping semantic-content-free IDs out of it.)
I always go with the rule that it is important to move from right to left when determining the most important information in your URL for the user (an actual user or Google). So the question you have to ask yourself is: do you want your user to see the ID or the title as the most important thing on the page?
Also, what happens if they drop the number and leave just the title? The page blows up, right? But what happens if you drop the slug and leave the number? The page functions as normal.
Unfortunately, given Google's (and most other search engines') security through obscurity (fear of gaming of the system if any clear methods are described/explained), there's just not going to be a clear, demonstrable answer.
On the whole, you can assume that if it seems slimy, Google will penalize it, and if it seems semantic/useful, Google will promote it. In the case of the above url, it's my guess that Google will treat both the same, but that's entirely a guess, and outside of a Google algorithmic engineer stopping in here, I doubt you're going to find anything more definitive.
Parsing the URL is probably a lot easier if the slug is at the end. You can pull out the values you need from the beginning of the path, and then just ignore everything after it. (so the slug could be even more complex than what you have, with multiple "directories", etc). If you put the slug at the beginning or the middle you have to be able to parse that out in order to find what's important.
https://stackoverflow.com/questions/727281/blahblablah-lets-assume-that-this-continues-on-and-on-and-on
Now if you truncate that, https://stackoverflow.com/questions/727281/blahbla still works.
In the other case, https://stackoverflow.com/questions/why-do-some-websites-add-slugs-to-the-end-of-urls/47427 truncated to https://stackoverflow.com/questions/why-do-so would have no chance of working.
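The difference between the two layouts can be shown with two small parsers. Both functions are hypothetical sketches and say nothing about how Stack Overflow actually routes requests.

```ruby
# ID-first layout: the ID comes right after /questions/, and whatever
# follows it (a full slug, a truncated one, or nothing) is ignored.
def id_from_id_first(path)
  match = path.match(%r{\A/questions/(\d+)(?:/|\z)})
  match && match[1].to_i
end

# Slug-first layout: the ID is the last segment, so the path has to
# arrive intact for the ID to be found at all.
def id_from_slug_first(path)
  match = path.match(%r{\A/questions/[^/]+/(\d+)/?\z})
  match && match[1].to_i
end
```

Truncation inside the digits of an ID-first URL would still yield a number, just the wrong one, so this layout only protects against cuts that land in the slug.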