Best approach to make a localized website [closed]

What's the best way to localize a website into multiple languages?
I'm working on a website, and our manager wants it to work like this:
http://www.website.com - defaults to English
http://fr.website.com - French
http://de.website.com - German
He says this is good for SEO. Some developers want to base it on a cookie and the user's Accept-Language header instead, so the URL would always be http://website.com but the content would depend on the cookie/Accept-Language.
What do you think?
Thanks!

This article appears to have a good guide to your question: http://www.antezeta.com/blog/domains-seo/
Essentially, it recommends localizing by country-level TLD first, then by subdomain, then by directory.
Cookies are a bad idea because Google will not be able to index your localized content.

This might be a late answer, but I'll give it anyway (my hope is that it will benefit others).
Should http://www.example.com/ default to English?
No. You should always detect the user's preferred language. The web browser sends an Accept-Language header listing the languages the end user understands, in order of preference. If the most preferred language is not one your web site/web application supports, try to fall back to the next language in Accept-Language. Only when nothing fits should you fall back to your default language (usually English, United States).
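For illustration, here is a minimal PHP sketch of that negotiation (the supported-language list and the subdomain redirect at the end are assumptions for the example):

```php
<?php
// Languages this site actually supports; the first entry is the final fallback.
$supported = ['en', 'fr', 'de'];

// Parse the Accept-Language header, e.g. "fr-CH, fr;q=0.9, en;q=0.8".
$header = $_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? '';
$candidates = [];
foreach (explode(',', $header) as $part) {
    $pieces = explode(';q=', trim($part));
    $lang = strtolower(substr($pieces[0], 0, 2)); // keep the primary subtag only
    $q = isset($pieces[1]) ? (float) $pieces[1] : 1.0;
    if ($lang !== '') {
        $candidates[$lang] = max($q, $candidates[$lang] ?? 0.0);
    }
}
arsort($candidates); // most preferred language first

// Pick the first supported candidate; fall back to the default when nothing fits.
$chosen = $supported[0];
foreach (array_keys($candidates) as $lang) {
    if (in_array($lang, $supported, true)) {
        $chosen = $lang;
        break;
    }
}

// e.g. send the user to the language-specific subdomain:
// header("Location: http://{$chosen}.example.com/", true, 302);
```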
Should we use the language as part of the domain?
It seems like a good idea. Once you have detected the language, you might want to redirect the user to the appropriate page. It could be something like http://french.example.com/, http://german.example.com/ or http://www.example.com/index.html?lang=fr.
It is good to have such a mechanism implemented - that way users can actually bookmark the correct language. Of course, if somebody navigates to your web site with the language already in the URL, you should skip detection, as it is pointless at that point.
To sum up:
Detect the language the web browser sends you, and appear as multiple web sites (one per language). That way users can choose which one to bookmark, and web search engines will index the contents of each language separately (subject to whatever your robots.txt allows). Either way, it is good to appear as several language-specific web sites.

I once heard a teacher of mine say that when he does this, he simply makes PHP files called "eng.php", "fr.php" and so on.
These files contain associative arrays. The keys are always the same, but the translations differ.
Then you need only require the correct language file at the top of your PHP files, and when you look up the keys, the text will always be in the correct language.
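A minimal sketch of that approach, assuming files named eng.php and fr.php that each return the same set of keys:

```php
<?php
// eng.php and fr.php each return an associative array with identical keys:
//   eng.php:  return ['greeting' => 'Welcome', 'contact' => 'Contact us'];
//   fr.php:   return ['greeting' => 'Bienvenue', 'contact' => 'Contactez-nous'];

// At the top of each page, require the correct language file:
$lang    = $_GET['lang'] ?? 'eng';        // could equally come from a cookie
$allowed = ['eng', 'fr'];                 // whitelist so users cannot include arbitrary files
$file    = in_array($lang, $allowed, true) ? "$lang.php" : 'eng.php';
$t       = require $file;

// Every lookup now resolves in the chosen language.
echo htmlspecialchars($t['greeting']);    // "Welcome" or "Bienvenue"
```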

Most open-source approaches to localization and globalization involve a lot of developer overhead, and maintenance grows harder as copy and code become more complex.
My current company, Localize.js, solves this pain point by tracking website phrase changes, automating the ordering of translations, and dynamically rendering languages for you.
https://localizejs.com/
Feel free to email me at johnny@localizejs.com if you have any questions.

Related

How to handle existing indexed mixed-case URLs? [closed]

I have an ASP.NET Web Forms application that has been live for a number of years and as such has quite a lot of content indexed on Google.
Ideally, I'd prefer that all URLs for the website be lowercase, but I understand that having two versions of the same content indexed in search engines (MixedCase.aspx and mixedcase.aspx) will be bad for SEO.
I was wondering:
a) Should I just leave everything in its current mixed-case form and never change it?
OR
b) I can change the code so everything is lowercase from here on, but is there a way of doing this so that the search engines are aware of the change and don't penalise me?
Having two versions of the same URL will cause duplicate content issues, although the search engines are generally smart enough to know that the two pages are the same.
There are two good solutions. The first is to use the canonical link tag to specify your preferred version of the URL. With this solution, both MixedCase.aspx and mixedcase.aspx would show the same page, but the search engines know definitively which is the "correct" URL to show. Make sure you update all your links to the lowercase version.
The second solution is to use 301 Redirects. Usually this is preferred because users will always wind up at the correct page. If they decide to link to it, they're using the correct version. As Rocky says, the redirects will need to stay in place permanently if you already have links from other sites. However, technical (or time) limitations may mean you need to use the canonical method.
You are wise to be wary of having two URLs serving the same content, as you will experience duplicate content issues from the search engines.
You can transfer your URLs, and their PR, from mixed case to lower case without too much of an issue by providing a 301 response code on the old mixed case URLs to the new lower case URLs.
So you would essentially have two URLs for every page:
Old mixed case URL which 301 redirects to the lower case URL
New lower case URL which serves the content
You will need to keep the old URLs in effect for a long time, possibly permanently (e.g. especially if there are third party links to them). Having done this myself, the search engines will continue to request the old URLs for years, even when they know that they redirect to the new URLs (Yahoo, in particular, was guilty of this).
Force lowercase: redirect all mixed-case URLs via HTTP 301 to the lowercase version.
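The question is about ASP.NET (where the equivalent logic would live in a URL-rewrite rule or in Application_BeginRequest), but the pattern is the same everywhere; a minimal PHP sketch:

```php
<?php
// 301-redirect any request whose path contains uppercase characters
// to the all-lowercase form, leaving the query string untouched.
$path  = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH) ?? '/';
$query = parse_url($_SERVER['REQUEST_URI'], PHP_URL_QUERY);

if ($path !== strtolower($path)) {
    $target = strtolower($path) . ($query !== null ? '?' . $query : '');
    header('Location: ' . $target, true, 301); // permanent: passes link equity to the new URL
    exit;
}
// ...otherwise serve the page as normal.
```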

Is it good idea to use URL names with special characters? [closed]

Is it good SEO to have URLs (page names) with non-English characters, such as Chinese names, in them?
From an SEO perspective:
GENERAL URL RULES:
All URLs on a web property should follow these rules (listed in priority):
1) unique (1 URL == 1 resource)
2) permanent (they do not change)
3) manageable (1 logic per site section, no complicated exceptions)
4) easily scalable logic
5) short
6) with a targeted keyword phrase
The targeted keyword phrase is the least important - but it's still important. If you can have a short, scalable, manageable, permanent, unique URL logic with non-English characters, then go for it.
There are benefits if the URL matches the search term: the search term gets highlighted in the SERPs. Additionally, the URL is the most used anchor text (as people tend to copy & paste URLs), so you get a good anchor text if you use the keyword (in whatever language) in the URL. The URL keyword is also treated as content and adds context to the page - another SEO plus.
So yes, go for it, but only if it does not work against principles 1 to 5.
As of June this year, ICANN has approved the use of Chinese characters in domains without requiring .cn at the end.
I wouldn't, for a simple reason: e-mail.
The e-mail protocol does not (yet?) support those characters. So if your domain were www.äüö.com, you could not use mail addresses like <...>@äöü.com.
See the first comment for a work-around.
No, it is not. First, you will have problems registering your domain name in the DNS system (you have to resolve it to Punycode).
Second, Googlebot and BingBot value keywords in URLs very much (PageRank), which unfortunately won't be recognized if your URL is Punycode/otherwise encoded (well, maybe Google has fixed it, but MS probably won't for another year or two).
Third, as far as page names are concerned, the browser has to support those languages, which is not certain for anything that isn't English.
Simply, no.
First of all, SEO wants your URL to be easily accessible, and I am not sure people can easily type a URL like:
www.çakıöğünüveşarkı.com
So first of all, your URL will be very unfriendly. This site is a simple tool for checking a URL for SEO.
Most web-based frameworks support slugification of your page names into accessible URLs.
So:
1- keep your URLs accessible
2- define your page title and meta tags so spiders read them properly, since meta tags have no problem with special characters
I am not very sure about the SEO aspect, but since you have tagged this with usability, I wish to add that it won't be a very good idea: it will be next to impossible for someone with a non-Chinese keyboard layout to type your URL. Unless it is extremely important for SEO, I would advise you to stay away from it.
If most of your users are Chinese and search in their native language, the answer is yes.
URLs cannot contain non-ASCII characters, but it is possible to encode non-ASCII characters in ASCII.
In the domain name part, you can use IDN. I don't know how well supported that is, but it's there.
In the path part, you can use the % escape notation on Unicode code points. This is well supported by current browsers and understood by search engines - so it is indeed good SEO. We're using it for European accented characters, and it all works fine.
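A minimal PHP sketch of both encodings (idn_to_ascii() comes from the intl extension; the example domain and path segment are made up):

```php
<?php
// Domain part: convert an internationalized domain name to its ASCII (Punycode) form.
$asciiDomain = idn_to_ascii('äöü.com', IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
// $asciiDomain is now the "xn--..." Punycode form that DNS can resolve.

// Path part: percent-escape a UTF-8 path segment on Unicode code points.
$segment = rawurlencode('café-menü');

echo "https://{$asciiDomain}/{$segment}\n";
// e.g. https://xn--.../caf%C3%A9-men%C3%BC
```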

Is having a descriptive URL needed to be a web 2.0 website? [closed]

Our in-house CMS has the ability to serve descriptive URLs (Descriptive URLs vs. Basic URLs) versus basic URLs (http://test.com/index.php?id=34234). We want to know whether, beyond giving a little more feedback to crawlers, this means anything else.
Does having descriptive URLs bring us other benefits?
Should we limit the size of the URL to a certain number of words?
Thanks for your time.
There are several benefits to descriptive URIs:
They can help with search engine optimization if they include relevant keywords
URIs without query strings are more reliably cached for GET requests (many proxies and caches decline to cache URLs with query parameters)
They are descriptive to the user, so their location within the site is clearer to them. This is helpful if they save the link too, or give it to a friend. The web benefits from semantic content, and this is just another way to provide it.
They may also be able to modify the URI directly, though this is a potential downside too.
It is generally good to keep the length under 256 characters due to legacy constraints, but today, the actual limit in practice is not well defined.
Descriptive URLs offer major SEO benefits, as search engines weigh the contents of the URL heavily.
There are many benefits to it. Not only do they work better for SEO, but they are often "hackable" by your end users.
https://stackoverflow.com/questions/tagged/php
That tells me pretty straightforwardly that I'm going to find questions tagged "PHP". Without knowing any special rules, I could guess how to find the jQuery questions.
You will run into a limit on the amount of space you can squeeze into a URL, but limit the URLs to core terms (like the title of an article, etc.) and you'll be fine.
One suggestion is to use these types of URLs, but have a fallback plan. For instance, the URL to this question is:
https://stackoverflow.com/questions/1347835/is-having-a-descriptive-url-needed-to-be-a-web-2-0-website
The first parameter is 1347835, which is the question id. The second parameter is the question title. The title is completely optional: it's not needed to access this page, but when you use it in links it improves the SEO of the page.
If you required the title to be exact, that might cause more problems than it's worth. Make SEO content like this optional for loading the content itself; SO only requires the question id, as I stated before.
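A minimal PHP sketch of that routing (the path pattern, database credentials and questions table are assumptions of the example):

```php
<?php
// Route /questions/{id}/{optional-slug}: only the numeric id is required.
if (!preg_match('#^/questions/(\d+)(?:/([^/?]*))?#', $_SERVER['REQUEST_URI'], $m)) {
    http_response_code(404);
    exit;
}
$id   = (int) $m[1];
$slug = $m[2] ?? '';

// Look the question up by id alone; the slug never touches the query.
$pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$stmt = $pdo->prepare('SELECT title, body, slug FROM questions WHERE id = ?');
$stmt->execute([$id]);
$post = $stmt->fetch(PDO::FETCH_ASSOC);

if ($post === false) {
    http_response_code(404);
    exit;
}

// If the slug is missing or stale, 301 to the canonical URL to avoid duplicate content.
if ($slug !== $post['slug']) {
    header("Location: /questions/{$id}/{$post['slug']}", true, 301);
    exit;
}

// ...render $post...
```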

Why should I use "Web 2.0"-style URLs? [closed]

In short, why use something like http://stackoverflow.com/badges/6/supporter instead of something "simpler" (and subjectively, at that) like http://stackoverflow.com/badges/6/?
Even on my own site I've just been using /post/6/ to reference posts (by ID, even though I still store a slug) instead of /post/6/small-rant-on-urls. In some cases slugs can get even more absurd, much longer than is really necessary.
Search engine optimisation would be one, as well as making the URL more readable to humans. Search engines generally like your URL, title and H2 to contain the "topic" of the page.
If you have both in there, you can manually type /ID and get automagically taken to the "flowery" URL via rewriting - saves your fingers a bit :)
Because you can potentially end up with duplicates if you're not careful. I imagine stack overflow added the ID because there was a high potential for duplicates given the volume of posts created.
Other systems may choose not to use the ID in the URL - for example, a blogging system probably would not need to.
Including a post ID is a better idea if you have user-generated content that results in new URLs being created. If the only way new URLs can be created is through administrator-type access, you can probably do without it, as long as you check for duplicates.
Adding the slug in all links to the content helps with search engines, because search engines will generally use words in the URL itself to help index content.
The reason for including the id in the URL is that it makes it easier behind the scenes to retrieve the correct article from the database, as a lookup can be performed on the ID rather than the article's title.
The reason for including the full title of the article is that Google gives heaps of bonus points for search terms that are matched in the filename.
URL is part of the Web user interface.
There is an eyetracking study of search engine use that found that people spend 24% of their gaze time looking at the URLs in the search results.
Searchers are particularly interested in the URL when they are assessing credibility and usefulness of a destination. If the URL looks like garbage, people are less likely to click on that search hit. On the other hand, if the URL looks like the page will address the user's question, they are more likely to click.
@Greg Hewgill
Adding the slug in all links to the content helps with search engines, because search engines will generally use words in the URL itself to help index content.
I should have clarified a bit: I meant URLs that have both an id and a slug in them. I just don't see the point in having something like /post/1/la-la-la-la-text-hahahaha vs /post/1/ or /post/la-la-la-la-text-hahahaha, since the first one would work without the extraneous text at the end.
It may be that it is faster to fetch a blog post by id than by slug, so use the id for the SQL query and the slug for the search engines (SEO).
I like the /post/la-la-la-la-text-hahahaha type; I can remember the URL and know what the title of the post is before actually loading the site. I don't much like /post/1/ - it means nothing to me but "post #1" (bad for marketing?).
Edit: the id also helps to avoid duplicates, as andybaird pointed out.
Well, firstly it should be pointed out that "Web 2.0-style URLs" are actually part of something called REST; such URLs are sometimes called RESTful URLs. The claimed benefits are:
- Provides improved response time and reduced server load due to its support for the caching of representations;
- Improves server scalability by reducing the need to maintain session state. This means that different servers can be used to handle different requests in a session;
- Requires less client-side software to be written than other approaches, because a single browser can access any application and any resource;
- Depends less on vendor software and mechanisms which layer additional messaging frameworks on top of HTTP;
- Provides equivalent functionality when compared to alternative approaches to communication;
- Does not require a separate resource discovery mechanism, due to the use of hyperlinks in representations;
- Provides better long-term compatibility and evolvability characteristics than RPC. This is due to:
  - The capability of document types such as HTML to evolve without breaking backwards- or forwards-compatibility; and
  - The ability of resources to add support for new content types as they are defined without dropping or reducing support for older content types.

Why do some websites add "Slugs" to the end of URLs? [closed]

Many websites, including this one, add what are apparently called slugs - descriptive but as far as I can tell useless bits of text - to the end of URLs.
For example, the URL the site gives for this question is:
https://stackoverflow.com/questions/47427/why-do-some-websites-add-slugs-to-the-end-of-urls
But the following URL works just as well:
https://stackoverflow.com/questions/47427/
Is the point of this text just to somehow make the URL more user friendly or are there some other benefits?
The slugs make the URL more user-friendly, and you know what to expect when you click a link. Search engines such as Google rank pages higher if the search word is in the URL.
Usability is one reason: if you receive that link in your e-mail, you know what to expect.
SEO (search engine optimization) is another reason. Search engines such as Google will rank your page higher for the keywords contained in the URL.
I recently changed my website url format from:
www.mywebsite.com/index.asp?view=display&postid=100
To
www.mywebsite.com/this-is-the-title-of-the-post
and noticed that click-through rates to articles increased about 300% after the change. It certainly helps users decide whether what they're thinking of clicking on is relevant. In terms of SEO, though, I have to say I've seen little impact from the change.
I agree with other responses that any mis-typed slug should 301-redirect to the proper form. In other words, https://stackoverflow.com/questions/47427/wh should redirect to https://stackoverflow.com/questions/47427/why-do-some-websites-add-slugs-to-the-end-of-urls. It has one other benefit that hasn't been mentioned: if you do not redirect to a canonical URL, it will appear that you have a near-infinite number of duplicate pages, and Google hates duplicate content.
That said, you should really only care about the content ID and allow any input for the slug, as long as you redirect. Why?
https://stackoverflow.com/questions/47427/why-do-some-websites-add-slugs-to-the-end-of-urls
... Oops, the mail software cut off the end of the URL! No problem, though, because you can still get there with just https://stackoverflow.com/questions/47427
The one big problem with this approach: if you derive the slug from the title of your content, how are you going to deal with non-ASCII, UTF-8 titles?
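One common answer is to transliterate the title to ASCII before slugifying. A minimal PHP sketch (iconv's //TRANSLIT behaviour varies between platforms, so treat this as an illustration, not a guarantee):

```php
<?php
// Turn an arbitrary UTF-8 title into an ASCII slug.
function slugify(string $title): string
{
    // Transliterate accented characters to their closest ASCII equivalents,
    // silently dropping anything that has none.
    $ascii = iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $title);
    // Lowercase, collapse runs of non-alphanumerics into single hyphens, trim.
    $slug = strtolower($ascii);
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    return trim($slug, '-');
}

echo slugify('Größe & Bedeutung von URLs');
// e.g. grosse-bedeutung-von-urls
```

For titles in scripts with no ASCII equivalents (Chinese, for example), transliteration yields nothing useful, so you would either percent-encode the UTF-8 slug or fall back to the numeric id alone.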
The reason most sites use it is probably SEO (Search Engine Optimization). Yahoo used to give a reasonable weighting to the presence of the search keyword in the URL itself, and it also helped in the Google result as well.
More recently the search engines have lowered the weighting given to keywords in the URL, likely because the technique is now more common on spam sites than legitimate ones. Keywords in the URL now have only a very minor impact on search results, if any.
As for stackoverflow itself, SEO might be a motivation (old habits die hard) or simply for usability.
It's basically a more meaningful location for the resource. Using the ID is perfectly valid but it means more to machines than people.
Strictly speaking, the ID shouldn't be needed if the slug is unique; you can more easily ensure unique slugs by scoping them inside dates.
i.e.:
/2008/sept/06/why-some-websites-add-slugs-end-of-urls/
Basically, this exploits the low likelihood of two identical slugs being in use on the same day. If there is a clash, the general convention is to add a counter at the end of the slug, but it's rare that you ever see these:
/2008/sept/06/why-some-websites-add-slugs-end-of-urls/
/2008/sept/06/why-some-websites-add-slugs-end-of-urls-1/
/2008/sept/06/why-some-websites-add-slugs-end-of-urls-2/
A lot of slug algorithms also drop common words like "the" and "a" to help keep the URL short. The scoped approach also makes it very straightforward to find all resources for a given day, month or year - you simply chop off segments.
Additionally, Stack Overflow's URLs are bad in the sense that they introduce an additional segment in order to feature the slug, which violates the idea that each segment should represent a descent into the resource hierarchy.
The term slug comes from the newspaper/publishing business. It's a short title that's used to identify a story in progress. People interested in URL semantics started using a short, abbreviated title in their URLs. It also pays off in SEO land, as keywords in URLs add importance to a page.
Ironically, lots of websites have started placing a full serialized-with-hyphens version of the title in their URLs strictly for SEO purposes, which means the term slug no longer quite applies. This also rankles semantic purists, as many implementations just tack this serialized version of the title onto the end of their URLs.
I note that you can change the text freely. This URL appears to work just as well.
https://stackoverflow.com/questions/47427/why-is-billpg-so-very-awesome
As already stated, the "slug" helps people and the search engines.
Something worth noticing is that the source of the page contains a canonical URL.
This stops the page from being indexed multiple times.
Example:
<link rel="canonical" href="http://stackoverflow.com/questions/47427/why-do-some-websites-add-slugs-to-the-end-of-urls">
Remove the formatting from your question, and you'll see part of the answer:
https://stackoverflow.com/questions/47427/
vs
https://stackoverflow.com/questions/47427/why-do-some-websites-add-slugs-to-the-end-of-urls
With no markup, the second one is self-descriptive.
Don't forget readability when sending a link, not just in search engines. If you email someone the first link they can look at the URL and get a general idea of what it is about. The second one gives no indication of the content of that page before they click.
If you emailed someone a link, wouldn't it make more sense to include a description by actually writing one out, rather than making the other person parse the URL where the description exists and try-to-read-a-bunch-of-hyphenated-words-stuck-together?
First off, it's SEO- and user-friendly, but in the case of the example (this site), it's not done well or correctly
(as it is open to black-hat tricks and rank poisoning by others, which would reflect badly on this site).
If
https://stackoverflow.com/questions/47427/why-do-some-websites-add-slugs-to-the-end-of-urls
has the content, then
https://stackoverflow.com/questions/47427/
and
https://stackoverflow.com/questions/47427/any-other-bollix
should not be duplicates. The site should automatically detect that the followed link does not use the current text (since the slug is derived from the question title, which can later be edited) and 301-redirect to
https://stackoverflow.com/questions/47427/why-do-some-websites-add-slugs-to-the-end-of-urls
thus ensuring the "one piece of content to one URI" rule; and if the URI moves/changes, the old bookmarks should follow it through 301 redirects (so intelligent browsers can update their bookmarks).
Ideally, the "slug" should be the only identifier needed. In practice, on dynamic sites such as this one, you either have to have a unique numerical identifier or start appending/incrementing numbers to the "slug", as Digg does.
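A minimal PHP sketch of that counter scheme (the slug-exists check is a stand-in for a real database lookup):

```php
<?php
// Make a slug unique by appending an incrementing counter on collision,
// e.g. "my-post", "my-post-2", "my-post-3", ...
function uniqueSlug(string $slug, callable $slugExists): string
{
    $candidate = $slug;
    for ($i = 2; $slugExists($candidate); $i++) {
        $candidate = "$slug-$i";
    }
    return $candidate;
}

// Usage with an in-memory stand-in for the database check:
$taken = ['my-post' => true, 'my-post-2' => true];
echo uniqueSlug('my-post', fn ($s) => isset($taken[$s])); // my-post-3
```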
