Why should I use "Web 2.0"-style URLs? [closed] - url

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
In short, why use something like http://stackoverflow.com/badges/6/supporter instead of something (subjectively) "simpler" like http://stackoverflow.com/badges/6/?
Even on my own site I've just been using /post/6/ to reference posts by ID (even though I still store a slug) instead of /post/6/small-rant-on-urls. In some cases these URLs get even more absurd, far longer than is really necessary.

Search Engine Optimisation would be one, as well as making the URL more readable to humans. Search engines generally like your URL, Title and H2 to contain the "topic" of the page.
If you have both in there then you can manually type /ID and get automagically taken to the "flowery" URL with rewriting.. saves your fingers a bit :)

Because you can potentially end up with duplicates if you're not careful. I imagine stack overflow added the ID because there was a high potential for duplicates given the volume of posts created.
Other systems may choose not to use the ID in the URL - for example, a blogging system probably would not need to.
Including a post ID is a better idea if user-generated content can create new URLs. If new URLs can only be created through administrator-type access, you can probably do without it as long as you check for duplicates.

Adding the slug in all links to the content helps with search engines, because search engines will generally use words in the URL itself to help index content.

The reason for including the id in the url is that it makes it easier behind the scenes to retrieve the correct article from the database, as a lookup can be performed on the ID rather than the article's title.
The reason for including the full title of the article, is that Google gives heaps of bonus points for search terms that are matched in the filename.
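A minimal sketch of the idea, assuming a hypothetical /post/<id>/<slug> route (the pattern and function name are made up for illustration): the ID is the only part used for the lookup, and the slug is optional decoration.

```python
import re

# Hypothetical route: /post/<id>/ or /post/<id>/<slug>
POST_ROUTE = re.compile(r"^/post/(?P<id>\d+)(?:/(?P<slug>[^/]*))?/?$")

def parse_post_path(path):
    """Return (post_id, slug_or_None), or None if the path is not a post URL."""
    match = POST_ROUTE.match(path)
    if match is None:
        return None
    slug = match.group("slug") or None  # treat an empty slug as missing
    return int(match.group("id")), slug
```

With this, /post/6/ and /post/6/small-rant-on-urls both yield ID 6, so the database query can stay a plain primary-key lookup.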

URL is part of the Web user interface.
There is an eyetracking study of search engine use that found that people spend 24% of their gaze time looking at the URLs in the search results.
Searchers are particularly interested in the URL when they are assessing credibility and usefulness of a destination. If the URL looks like garbage, people are less likely to click on that search hit. On the other hand, if the URL looks like the page will address the user's question, they are more likely to click.

@Greg Hewgill
Adding the slug in all links to the content helps with search engines, because search engines will generally use words in the URL itself to help index content.
I should have clarified a bit: I meant URLs that have both an id and slug in them. I just don't see the point in having something like /post/1/la-la-la-la-text-hahahaha vs /post/1/ or /post/la-la-la-la-text-hahahaha, since the first one would work without the extraneous text at the end.

It could be that it is faster to fetch a blog post by ID than by slug, so use the ID for the SQL query and the slug for search engines (SEO).
I like the /post/la-la-la-la-text-hahahaha type: I can remember the URL and know what the title of the post is before actually loading the site. I don't much like /post/1/; it means nothing to me except post #1 (bad for marketing?).
Edit: the ID also helps to avoid duplicates, as andybaird pointed out.

Well, firstly it should be pointed out that the "Web 2.0 style URLs" are actually part of something called REST. Those URLs are sometimes called RESTful URLs. The claimed benefits are:
Provides improved response time and reduced server load due to its support for the caching of representations;
Improves server scalability by reducing the need to maintain session state. This means that different servers can be used to handle different requests in a session;
Requires less client-side software to be written than other approaches, because a single browser can access any application and any resource;
Depends less on vendor software and mechanisms which layer additional messaging frameworks on top of HTTP;
Provides equivalent functionality when compared to alternative approaches to communication;
Does not require a separate resource discovery mechanism, due to the use of hyperlinks in representations;
Provides better long-term compatibility and evolvability characteristics than RPC. This is due to:
    The capability of document types such as HTML to evolve without breaking backwards- or forwards-compatibility; and
    The ability of resources to add support for new content types as they are defined without dropping or reducing support for older content types.

Related

Lack of invariance in stackoverflow URL. Why? [duplicate]

This question already has answers here:
Closed 13 years ago.
Possible Duplicate:
Why do some websites add “Slugs” to the end of URLs?
This is not a question about stackoverflow, it's a question about a design decision which stackoverflow implements, and I take it as example.
A question on stackoverflow is identified by the following URL (took one from the suggestions)
https://stackoverflow.com/questions/363292/why-is-visual-c-lacking-refactor
Similarly, my user URL is:
https://stackoverflow.com/users/78374/stefano-borini
The fact is, only the numeric index is actually used:
https://stackoverflow.com/users/78374/
The remaining part can be anything. What is the reason behind this design decision, particularly considering that "cool URIs do not change"?
Edit: voting for close after I saw this question which substantially puts the same issue forward. My question is a duplicate
Part of the reason is so you can change your user name or the title of the post (correcting spellings etc.) but leave the URL valid.
It makes SEO sense to have the title in the URL - it makes it a lot more likely that the site will get indexed correctly.
It allows the URL to contain some interesting information for humans and search engines, but still works even if the title changes.
You could store the original "slug" in the database and verify against that as well as the id, but the only thing it prevents is games like this:
Lack of invariance in stackoverflow URL. Why?
:)
Search engines like text in URLs.
Pages are given higher rank when the search terms actually appear in the URL rather than just the page. It's robot sugar, basically.
This allows you to see some text in the URL which means something to you. If I look through a history of links to a bunch of questions, the number alone would be meaningless. However, having the text there gives me some context.
This is SEO (search engine optimization) in action. It helps with the ranking of pages in search results (on Google, Yahoo, Bing, ...), because search engines give higher rankings to pages whose URLs contain keywords the user is searching for.
Theoretically, it's for SEO reasons. The software ignores the part following the identifier (the "slug"), but the idea is that search engine crawlers consider the description part of the link text, and thus weigh the resulting page higher in search results. Whether this actually happens in any meaningful way, I don't know for sure.
A more practical use is that you can gain a better idea of where a link's pointing just by inspecting the slug, which is handy if you've got multiple question URLs.
In addition to the benefit for Search Engines (the text in the URL is powerful), the fact is for all practical purposes this URL does not "change". A change can be defined as something which causes a link at some point in the future not to work - this would never be the case with this URL. The varying text at the end does not affect any user's ability to access the underlying resource.
"cool URIs do not change"
Cool URIs can change, as long as the old ones are still fine.
Maybe someone will decide that “Lack of invariance in stackoverflow URL. Why ?” is a bad question title, and change it to “Why is there redundant information in SO URLs?”. It would be good if the slug can update to reflect the new title, especially if the reason we wanted to change it was an embarrassing typo. But the old URI must continue to work.
One drawback to non-canonical URIs is that search engines can get confused and think they're two different pages. Or they'll spot that they're two pages the same, but decide that the ‘best’ page to link to is the one with the title you don't want. This is especially bad if lots of people link to another title completely like:
http://stackoverflow.com/questions/1534161/stackoverflow-smells-of-poo
cue more embarrassment. The best way around this (though few sites bother, and SO doesn't) is to check the slug on submission and do a 301 permanent redirect to the new URI with the up-to-date slug instead. Search engines will pick up the new URI and not any malicious one with poo in it.
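A rough sketch of that redirect check, under some assumptions: QUESTIONS is a hypothetical in-memory store, and slugify and respond are illustrative names, not anything SO actually runs. The ID finds the question; the slug is only compared against the current title.

```python
def slugify(title):
    """Lowercase the title and join its alphanumeric words with hyphens."""
    words = "".join(c if c.isalnum() else " " for c in title.lower()).split()
    return "-".join(words)

# Hypothetical in-memory store: question id -> current title
QUESTIONS = {1534161: "Why is there redundant information in SO URLs?"}

def respond(question_id, requested_slug):
    """Return (status, location); location is None when no redirect is needed."""
    title = QUESTIONS.get(question_id)
    if title is None:
        return 404, None
    current_slug = slugify(title)
    if requested_slug != current_slug:
        # Stale, missing, or malicious slug: send a permanent redirect
        return 301, f"/questions/{question_id}/{current_slug}"
    return 200, None
```

Any old or made-up slug still reaches the question, but search engines only ever see one URI answer with a 200.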

Is having a descriptive URL needed to be a web 2.0 website? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
Our in-house CMS can generate descriptive URLs (Descriptive URLs vs. Basic URLs) as well as basic URLs (http://test.com/index.php?id=34234). We want to know whether, other than giving a little more feedback to crawlers, they mean anything else.
Do these descriptive URLs bring us other benefits?
Should we limit the length of the URL to a certain number of words?
Thanks for your time.
There are several benefits to descriptive URIs:
It can help with search engine optimization if they include relevant keywords
URIs without query strings tend to be cached more reliably for GET requests (some caches and proxies are conservative about caching URLs that contain query parameters)
They are descriptive to the user, so their location within the site is clearer to them. This is helpful if they save the link too, or give it to a friend. The web benefits from semantic content, and this is just another way to provide it.
They may also be able to modify the URI directly, though this is a potential downside too.
It is generally good to keep the length under 256 characters due to legacy constraints, but today, the actual limit in practice is not well defined.
Descriptive URLs offer major SEO benefits, as search engines weigh the contents of the URL heavily.
There are many benefits to it. Not only do they work better for SEO, but they are often times hackable for your end-users.
https://stackoverflow.com/questions/tagged/php
That tells me pretty straight forward that I'm going to find questions tagged as "PHP." Without knowing any special rules, I could guess how to find the jQuery questions.
You will run into a limit on the amount of space you can squeeze into a url, but limit the urls to core-terms (like titles to an article, etc) and you'll be fine.
One suggestion is to use these types of urls, but have a fall-back plan. For instance, the url to this question is:
Is having a descriptive URL needed to be a web 2.0 website?
The first parameter is 1347835, which is the question id. The second parameter is the question title. The title here is completely optional. It's not needed to access this page, but when you use it in links it increases the SEO for this page.
If you were to require the title be exact, that may cause more problems than you want. Make the SEO-content like this optional for loading the content itself. SO only requires the question-id, as I stated before.

Best way to format pretty URLs for numeric IDs

Alright, so let's say I'm writing a forum application, and I want pretty URLs. However, all my tables use numeric IDs, so I'm not sure the best way to format the URLs for those resources. Let's pretend I'm trying to get a topic with ID 123456 and title This is a forum post. I've seen it done a couple ways:
www.example.com/topic/123456
www.example.com/topic/this-is-a-forum-post
www.example.com/topic/123456/this-is-a-forum-post
Which one would you say is, taking all things into consideration (including SEO), the optimal URL?
Sorry if this question is too vague, but it seems programming-related and it's not incredibly open-ended, as I just want to hear the pros and cons of each method.
I would go with option 3, and make the slug (the last bit) optional
Because?
The ID will always be unique... 2 people may make a thread with the name 'good news' for example
The search bots can access the slug for some SEO goodness
The slug should be optional ... Using just the ID should still give you access to the site. Perhaps if the slug isn't there you could forward to the slug'd version, if you're concerned about duplicate content. You could always use the canonical link tag (rel="canonical") to tell Google to index the slugged version.
Another benefit of the optional slug is if someone copies and pastes the URL into a document, there is a chance it could have characters at the end chopped off (because URLs generally don't have spaces, so they don't break to new lines). Having the slug optional means there is more of a chance people will find your page.
I believe this is what Stack Overflow does.. and also notice they are doing rather well in the Search Engines.
Update
From the comments, be sure to 301 redirect any missing slug version to the correct slug.
URL 1 is definitely suboptimal. URL 2 is attractive but you run the risk of confusion if tags collide, especially if they differ only in punctuation. So I'd say URL 3 is the clear winner.
Also note that just because you display URL 3 is no reason not to accept all 3, with the other two redirecting. If URL 2 is ambiguous, it should redirect to a disambiguation page.
I would think that the 2nd URL would be the best for SEO since it is meaningful and has less depth. It's nicer for people as well since you can look at the URL and know what the content is about.
1. Doesn't include the title, so you'll lose the additional SEO value of having those keywords in the URL.
2. Won't work well, because it doesn't have a unique numerical ID, so what are you going to do if someone else tries to post a topic titled "This is a forum post"? Then you start getting into the weird thing digg does, where it has to give the second one the url "http://www.example.com/topic/this-is-a-forum-post_2", and so on. It makes it harder to take the URL they tried to load, and figure out exactly which topic they were trying to get to.
3. Has the best of both worlds, this would be my style of choice.
Stack Overflow seems to be using pattern 3, with the title being ignored completely (just the id is used).
That makes for nice semantic URL, and is also easy to implement, and still works if the title changes later.
Of course, the title could be completely fake:
Best way to format pretty URLs for numeric IDs
I'll go for the first one. It really doesn't matter much now, since URL shortening services exist and will likely proliferate and become the norm. Remember, the longer your URL, the fewer SEO points you'll get.
And you can't control the way people name their forum topics. So really, I'd just choose the first one for simplicity and because it's the norm.
For SEO/traffic, definitely no.2 without a doubt. Get those meaningless numbers out of the URL every single time.
www.example.com/topic/this-is-a-forum-post
Pick up the "this-is-a-forum-post" from the URL and map it back to the ID number within your database via a query. Then do an internal URL rewrite to the real page, something like /topic.php?ID=324342
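For what it's worth, the lookup described above could be sketched like this. The topics table, its columns, and the rewrite function are assumptions made up for the example; an in-memory SQLite database stands in for the real one.

```python
import sqlite3

def make_db():
    """Build a throwaway in-memory table; the schema is an assumption."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE topics (id INTEGER PRIMARY KEY, slug TEXT UNIQUE, title TEXT)"
    )
    conn.execute(
        "INSERT INTO topics VALUES (?, ?, ?)",
        (324342, "this-is-a-forum-post", "This is a forum post"),
    )
    return conn

def rewrite(conn, path):
    """Map /topic/<slug> to the internal /topic.php?ID=<id>, or None if unknown."""
    prefix = "/topic/"
    if not path.startswith(prefix):
        return None
    slug = path[len(prefix):].strip("/")
    row = conn.execute("SELECT id FROM topics WHERE slug = ?", (slug,)).fetchone()
    if row is None:
        return None
    return f"/topic.php?ID={row[0]}"
```

Note the UNIQUE constraint on slug: without a numeric ID in the URL, two topics with the same title cannot both keep the plain slug.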
I would go with option 2, as search engines can understand it better.
Stack Overflow probably uses the third way, and that may be why Stack Overflow URLs are not well optimized for SEO. I am not sure about the above answer.
But in my experience with Google, I could quite often see solutions from other forums, whereas Stack Overflow solutions were almost invisible.
Best way to format pretty URLs for numeric IDs
Best way to format pretty URLs for numeric IDs
If both URLs are one and the same, search engines simply go with option 2, which is less optimized.
I'm not convinced longer URLs are SEO trouble. The depth seems to be a bigger issue, and not by counting slashes, but by the steps it takes to get from an indexed page with rank to the content page. I recently created a dummy test page titled /content/roofing/how-much-does-a-shingle-roof-cost.html and threw it on the server just to test pathways and make sure my directories were working correctly. I'm not even sure how Google discovered the page, but it did and it started getting traffic, so I had to give it content and make it part of the family. The dummy content was a copy of our about page so it wasn't empty, but I was surprised an unpromoted page would get traction, and think the URL had something to do with that.
Which brings up a slight alternative to the above 3 choices for a URL. What if you went with number 3 but added .html to the end? I generally do this with dynamic URLs, but I have no concrete evidence that it's helpful. Google claims it can index dynamic URLs just fine, so there's no need to do URL rewrites at all. Google doesn't mind a bit if the other engines aren't as good at that. Several sites I trust add the html at the end (Blogger, for example) and it can't hurt, so I still do it.
I would suggest the first one, since the topic title can be changed for clarity by the admins, and then the URL would be inconsistent.
www.example.com/topic/123456
It also allows one to just edit the last bit of the URL (the numbers) and jump to another topic; not likely to happen, but still a usable feature.

Best approach to make a localized website [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
What's the best way to localize a website into multiple languages?
I'm working on a website, and our manager wants it to be like:
http://www.website.com - default to english
http://fr.website.com - french
http://de.website.com - german
He says it's good for SEO. Some developers want to base it on a cookie and the user's Accept-Language header, so the URL would always be http://website.com but the content would depend on the cookie/Accept-Language.
What do you think?
thanks!
This article appears to have a good guide to your question: http://www.antezeta.com/blog/domains-seo/
Essentially, they recommend localizing by TLD most, followed by Subdomain, followed by directories
Cookies are a bad idea because Google will not be able to index your localized content.
This might be a late answer, but I will give it anyway (my hope is it will benefit others).
Should http://www.example.com/ default to English?
No. You should always detect the user's preferred language. The web browser sends an Accept-Language header listing the languages the end user is able to understand. If the most preferred one is not one that your web site/web application supports, you should try to fall back to the next language from Accept-Language. Only when nothing fits should you fall back to your default language (usually English, United States).
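A simplified sketch of that fallback chain, assuming the header arrives as a plain string; parse_accept_language and choose_language are illustrative names, and real frameworks usually ship their own negotiation helpers that handle more edge cases.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, most preferred first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without an explicit q-value counts as 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            weighted.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(weighted)]

def choose_language(header, supported, default="en"):
    """Pick the first preferred language the site supports, else the default."""
    for tag in parse_accept_language(header):
        if tag in supported:
            return tag
        primary = tag.split("-")[0]  # e.g. "fr-ca" falls back to "fr"
        if primary in supported:
            return primary
    return default
```

For example, a browser sending "fr-CA,fr;q=0.9,en;q=0.8" to a site that supports en, fr and de would be given French.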
Should we use languages as part of the domain?
It seems a good idea. Once you have detected the language, you might want to redirect the user to the appropriate page. It could be something like http://french.example.com/, http://german.example.com/ or http://www.example.com/index.html?lang=fr.
It is good to have such a mechanism implemented: in this case one could actually bookmark the correct language. Of course, if somebody navigates to your web site with the language as a parameter, you should skip detection, as it is pointless at that point.
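One way to check whether the URL already names a language, sketched with the standard library. The SUPPORTED set, the use of short codes such as fr as the subdomain label, and the function name are assumptions for the example.

```python
from urllib.parse import urlsplit, parse_qs

SUPPORTED = {"en", "fr", "de"}  # assumed set of site languages

def language_from_url(url):
    """Return the language named by the URL (?lang= or subdomain), or None."""
    parts = urlsplit(url)
    query_lang = parse_qs(parts.query).get("lang", [None])[0]
    if query_lang in SUPPORTED:
        return query_lang
    label = (parts.hostname or "").split(".")[0]  # leftmost host label
    if label in SUPPORTED:
        return label
    return None
```

If this returns a language, detection can be skipped; if it returns None, the site falls back to detecting the language and redirecting.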
To sum up:
You should detect the language that the web browser sends you and present the site as multiple web sites (one per language). That way the user can choose which one to bookmark. Web search engines will probably index the contents separately, although they would rather look for robots.txt, so... Either way, it is good to appear as several language-specific web sites.
I once heard a teacher of mine say that when he does this, he simply makes PHP files called "eng.php", "fr.php" and so on...
In these files are associative arrays. The keys are always the same but the translations are different.
Then you only need to require the correct language file at the top of your PHP files, and when you look up the keys, the text will always be in the correct language.
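The same idea, sketched in Python rather than PHP, with dictionaries standing in for the per-language files. The keys, the sample strings, and the fallback to English are assumptions added for illustration.

```python
# Stand-ins for the per-language files; keys are shared, values are translated.
TRANSLATIONS = {
    "eng": {"greeting": "Hello", "farewell": "Goodbye"},
    "fr": {"greeting": "Bonjour", "farewell": "Au revoir"},
}

def translate(key, language, fallback="eng"):
    """Look up a key in the chosen language, falling back to English."""
    table = TRANSLATIONS.get(language, TRANSLATIONS[fallback])
    # An untranslated key falls back to English, then to the key itself
    return table.get(key, TRANSLATIONS[fallback].get(key, key))
```

Page templates then only ever reference keys such as "greeting", never literal text.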
Most open-sourced approaches to localization and globalization involve a lot of developer overhead and complexity in maintenance as copy and code become more complex.
My current company Localize.js solves this complex pain point seamlessly, by tracking website phrase changes, automated ordering of translations, as well as dynamic rendering of languages for you.
https://localizejs.com/
Feel free to email me at johnny@localizejs.com if you have any questions.

Why do some websites add "Slugs" to the end of URLs? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
Many websites, including this one, add what are apparently called slugs - descriptive but as far as I can tell useless bits of text - to the end of URLs.
For example, the URL the site gives for this question is:
https://stackoverflow.com/questions/47427/why-do-some-websites-add-slugs-to-the-end-of-urls
But the following URL works just as well:
https://stackoverflow.com/questions/47427/
Is the point of this text just to somehow make the URL more user friendly or are there some other benefits?
The slugs make the URL more user-friendly and you know what to expect when you click a link. Search engines such as Google, rank the pages higher if the searchword is in the URL.
Usability is one reason, if you receive that link in your e-mail, you know what to expect.
SEO (search engine optimization) is another reason. Search engines such as google will rank your page higher for the keywords contained in the url
I recently changed my website url format from:
www.mywebsite.com/index.asp?view=display&postid=100
To
www.mywebsite.com/this-is-the-title-of-the-post
and noticed that click-through rates to articles increased by about 300% after the change. It certainly helps users decide whether what they're thinking of clicking on is relevant. In terms of SEO, though, I have to say I've seen little impact after the change.
I agree with other responses that any mis-typed slug should 301-redirect to the proper form. In other words, https://stackoverflow.com/questions/47427/wh should redirect to https://stackoverflow.com/questions/47427/why-do-some-websites-add-slugs-to-the-end-of-urls . It has one other benefit that hasn't been mentioned--if you do not do a redirect to a canonical URL, it will appear that you have a near-infinite number of duplicate pages. Google hates duplicate content.
That said, you should really only care about the content ID and allow any input for the slug as long as you redirect. Why?
https://stackoverflow.com/questions/47427/why-do-some-websites-add-slugs-to-the-end-of-urls
... Oops, the mail software cut off the end of the URL! No problem though because you still can roll with just https://stackoverflow.com/questions/47427
The one big problem with this approach is if you derive the slug from the title of your content, how are you going to deal with non-ASCII, UTF-8 titles?
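Two common ways of dealing with that, sketched with the standard library only. Both function names are made up, and neither is a complete answer: the first simply loses scripts that have no ASCII decomposition.

```python
import re
import unicodedata

def ascii_slug(title):
    """Transliterate accented Latin letters, then hyphenate what remains."""
    decomposed = unicodedata.normalize("NFKD", title)
    # Dropping non-ASCII bytes removes combining accents (and everything else)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")

def unicode_slug(title):
    """Keep non-ASCII letters; the URL layer can percent-encode them."""
    return re.sub(r"[^\w]+", "-", title.lower()).strip("-")
```

The first approach turns "Crème brûlée" into creme-brulee but reduces a purely Japanese title to an empty string, which is another argument for keeping the numeric ID in the URL.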
The reason most sites use it is probably SEO (Search Engine Optimization). Yahoo used to give a reasonable weighting to the presence of the search keyword in the URL itself, and it also helped in the Google result as well.
More recently the search engines have lowered the weighting given to keywords in the URL, likely because the technique is now more common on spam sites than legitimate. Keywords in the URL now have only a very minor impact on the search results, if at all.
As for stackoverflow itself, SEO might be a motivation (old habits die hard) or simply for usability.
It's basically a more meaningful location for the resource. Using the ID is perfectly valid but it means more to machines than people.
Strictly speaking the ID shouldn't be needed if the slug is unique, you can more easily ensure unique slugs by scoping them inside dates.
ie:
/2008/sept/06/why-some-websites-add-slugs-end-of-urls/
Basically this exploits the low likelihood of two identical slugs being in use on the same day. If there is a clash the general convention is to add a counter at the end of the slug but it's rare that you ever see these:
/2008/sept/06/why-some-websites-add-slugs-end-of-urls/
/2008/sept/06/why-some-websites-add-slugs-end-of-urls-1/
/2008/sept/06/why-some-websites-add-slugs-end-of-urls-2/
A lot of slug algorithms also get rid of common words like "the" and "a" to assist in keeping the URL short. This scoped approach also makes it very straightforward to find all resources for a given day, month or year - you simply chop off segments.
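The scoped scheme above could be sketched roughly as follows. The stop-word list, the month abbreviations, and the taken set (standing in for a database uniqueness check) are all assumptions for the example.

```python
import re
from datetime import date

STOP_WORDS = {"a", "an", "the", "do", "to"}  # illustrative list, not a standard
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sept", "oct", "nov", "dec"]

def make_slug(title):
    """Lowercase, split on non-alphanumerics, and drop common words."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(w for w in words if w not in STOP_WORDS)

def scoped_path(title, day, taken):
    """Return /YYYY/mon/DD/slug/, adding -1, -2, ... if the path is taken."""
    base = f"/{day.year}/{MONTHS[day.month - 1]}/{day.day:02d}/{make_slug(title)}"
    candidate, counter = base, 0
    while (candidate + "/") in taken:
        counter += 1
        candidate = f"{base}-{counter}"
    taken.add(candidate + "/")
    return candidate + "/"
```

Calling scoped_path three times with the same title and date yields the plain path first, then the -1 and -2 variants.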
Additionally, stackoverflow URLs are bad in the sense that they introduce an additional segment in order to feature the slug, which is a violation of the idea that each segment should represent descending a resource hierarchy.
The term slug comes from the newspaper/publishing business. It's a short title that's used to identify a story in progress. People interested in URL semantics started using a short, abbreviated title in their URLs. It also pays off in SEO land, as keywords in URLs add importance to a page.
Ironically, lots of websites have started placing a full serialized-with-hyphens version of the title in their URLs strictly for SEO purposes, which means the term slug no longer quite applies. This also rankles semantic purists, as many implementations just tack this serialized version of the title onto the end of their URLs.
I note that you can change the text freely. This URL appears to work just as well.
https://stackoverflow.com/questions/47427/why-is-billpg-so-very-awesome
As already stated, the 'slug' helps people and the search engines...
Something worth noticing is that the source of the page contains a canonical URL.
This stops the page from being indexed multiple times.
Example:
<link rel="canonical" href="http://stackoverflow.com/questions/47427/why-do-some-websites-add-slugs-to-the-end-of-urls">
Remove the formatting from your question, and you'll see part of the answer:
https://stackoverflow.com/questions/47427/
vs
https://stackoverflow.com/questions/47427/why-do-some-websites-add-slugs-to-the-end-of-urls
With no markup, the second one is self-descriptive.
Don't forget readability when sending a link, not just in search engines. If you email someone the first link they can look at the URL and get a general idea of what it is about. The second one gives no indication of the content of that page before they click.
If you emailed someone a link, wouldn't it make more sense to include a description by actually writing one out, rather than making the other person parse the URL where the description lives and try-to-read-a-bunch-of-hyphenated-words-stuck-together?
First off, it's SEO and user friendly, but in the case of the example (this site), it's not done well or correctly
(as it is open to black hat tricks and rank poisoning by others, which would reflect badly on this site).
If
https://stackoverflow.com/questions/47427/why-do-some-websites-add-slugs-to-the-end-of-urls
has the content, then
https://stackoverflow.com/questions/47427/
and
https://stackoverflow.com/questions/47427/any-other-bollix
should not be duplicates. They should actually automatically detect the link followed is not using the current text (as obviously the slug is defined by the question title and can be later edited) and they should redirect 301 automatically to
https://stackoverflow.com/questions/47427/why-do-some-websites-add-slugs-to-the-end-of-urls
thus ensuring the "one piece of content to one URI" rule, and if the URI moves/changes, ensure the old bookmarks follow/move with it through 301 redirects (so intelligent browsers can update the bookmarks).
Ideally, the "slug" should be the only identifier needed. In practice, on dynamic sites such as this, you either have to have a unique numerical identifier or start appending/incrementing numbers to the "slug" like Digg does.
