I've been asked to work out how to maximize the visibility of an upcoming web application that will initially be available in multiple languages, specifically French and English.
I am interested in understanding how robots, such as Googlebot, crawl a site that is available in multiple languages.
I have a few questions concerning the behaviour of robots and indexing engines:
1. Should a web site specify the language in the URL?
2. Will a robot scrape a site in both languages if the language is set through cookies (supposing a link that can change the language)?
3. Should I use a distinct domain for each language?
4. What meta tag could be used to help a robot in understanding the language of a web site?
5. Am I missing anything that I should be aware of?
1. Yes
2. No
3. Not necessarily; Google will infer the language. If you use a different TLD per country you will probably get better exposure in those countries, but you lose PageRank, which gets diluted across the different domains.
4. <meta http-equiv="content-language" content="en">
5. a. You should add a link on every page to the same page in the other languages of the site.
   b. For SEO, it's better to use www.mysite.com/en/ than en.mysite.com, because the PageRank is not diluted across different domains.
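One way to make the cross-language links from point (a) machine-readable is the rel="alternate" hreflang annotation in the page head. The sketch below is only an illustration: it assumes the /en/ and /fr/ directory layout on www.mysite.com from point (b), and the helper name alternate_links is made up.

```python
from html import escape

# Assumptions for illustration: the two launch languages from the question
# and the directory-per-language layout suggested in point (b).
LANGUAGES = ("en", "fr")
BASE_URL = "https://www.mysite.com"


def alternate_links(page_path):
    """Return one <link rel="alternate"> tag per language version of a page."""
    path = page_path.lstrip("/")
    links = []
    for lang in LANGUAGES:
        href = escape(f"{BASE_URL}/{lang}/{path}", quote=True)
        links.append(f'<link rel="alternate" hreflang="{lang}" href="{href}">')
    return links


if __name__ == "__main__":
    # Every language version of the page would emit the same set of tags.
    for tag in alternate_links("/pricing"):
        print(tag)
```

The same list of tags goes on both the English and the French version of the page, so each version points at all the others.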
Should a web site specify the language in the URL?
No, not necessarily.
Will a robot scrape a site in both languages if the language is set through cookies (supposing a link that can change the language)?
No. You should use a content-language attribute as suggested by Eduardo. Alternatively, <html lang='en'> will do the same job AFAIK.
What meta tag could be used to help a robot in understanding the language of a web site?
See above
Should I use a distinct domain for each language?
The Stack Overflow consensus (I'm sorry, I can't for the life of me find the relevant questions! We had huge discussions on this; maybe they were closed as not programming related) is: yes, have a different domain for each country if you want to maximize search engine visibility for that country.
Related
Good day!
I cannot find a complete description of exactly what makes this particular schema useful for a business in the SERP. I really don't understand why an organization should add schema markup if it provides no benefit in search results. Isn't it easier to create an account in Google My Business or in some catalog with reviews? In that case we can see the snippet with 'rating stars'.
For example, here are two snippets from search results:
organization1 has Schema.org/Organization markup on its page:
Search result snippet1
organization2 has no markup on its site, but has a page in the Yelp catalog:
Search result snippet2
Moreover, I cannot understand how "aggregateRating" (based on a collection of reviews or ratings of the item) calculates this rating.
Please, can anyone explain it to me?
Check the FAQ section on Schema.org, in the About section. The answers related to your question are:
Q: What is the purpose of schema.org?
Schema.org is a joint effort, in the spirit of sitemaps.org, to
improve the web by creating a structured data markup schema supported
by major search engines. On-page markup helps search engines
understand the information on web pages and provide richer search
results. A shared markup vocabulary makes it easier for webmasters to
decide on a markup schema and get the maximum benefit for their
efforts. Search engines want to make it easier for people to find
relevant information on the web. Markup can also enable new tools and
applications that make use of the structure.
Q: Why are Google, Bing, Yandex and Yahoo! collaborating? Aren't you competitors?
Currently, there are many standards and schemas for marking up
different types of information on web pages. As a result, it is
difficult for webmasters to decide on the most relevant and supported
markup standards to use. Creating a schema supported by all the major
search engines makes it easier for webmasters to add markup, which
makes it easier for search engines to create rich search features for
users.
There's also a video on YouTube about using schema for SEO for your business.
Structured data is a standardized format of code that is added to a web page. It communicates specific information about a page to Google. This makes it easier and faster for search engines to crawl and index your content. In other words, it provides the context search engines need to properly categorize your site and recommend it more accurately for relevant search queries.
Google is using this data to make their search engine more accurate by creating a knowledge graph. This graph is an interconnected map of entities that follows the relationship between different terms, facts, data, dates, and more. This allows Google to go from keyword matching to a context-rich search engine, capable of differentiating the Taj Mahal monument from the Taj Mahal casino in Atlantic City.
What it means for SEOs is that Google has given you a way to introduce your clients' brands and companies into their knowledge graph, making them real objects Google knows about and can recommend to users. Check out our structured data guide on how to implement it on your site, including the recommended format for SEO and more on schema markups and aggregateRating.
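On the aggregateRating part of the question: as far as I know the markup does not calculate anything by itself. The publisher works out the figures from its own reviews and states them in the markup. Here is a minimal sketch of that, assuming a made-up organization name and made-up ratings on a 1 to 5 scale; the function name organization_jsonld is hypothetical too.

```python
import json


def organization_jsonld(name, ratings):
    """Build Organization markup whose aggregateRating is the plain average
    of the ratings the site has collected (assumed to be on a 1-5 scale)."""
    average = round(sum(ratings) / len(ratings), 1)
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": average,        # computed by the publisher
            "ratingCount": len(ratings),   # how many ratings went into it
            "bestRating": 5,
        },
    }


if __name__ == "__main__":
    # Made-up ratings, purely for illustration.
    print(json.dumps(organization_jsonld("Example Org", [5, 4, 4, 3]), indent=2))
```

The resulting JSON would normally be embedded in the page inside a script tag of type application/ld+json. Whether a search engine then shows stars for it is its own decision.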
I have recently been doing a lot of reading on SEO with HTML5 (I am a Rails web developer), and have been doing a lot of work with microdata as I have seen that the Schema.org format is the preferred format of Google.
What I am wondering is whether somebody can explain to me the importance of also including a sitemap.
From what I understand, the crawlers just go through all the links on a page from wherever they come to your site, and are then able to gather all the data they need from well-written microdata tags.
So what is the additional benefit of including a sitemap, and is it really worthwhile? It is possible that I am misunderstanding the purpose of a sitemap or the functionality of search engine crawlers.
A consumer can only read the Microdata if it has found the document that contains it.
A sitemap is one way (of many) that allows consumers to find the document. Another common way is to follow hyperlinks (from plain HTML, no Microdata needed), but there may be sites that don’t link to every document, so consumers would not find these documents that way.
(Whether it’s worthwhile, e.g. whether there’s an SEO benefit, depends on the consumer. A consumer can be a search engine bot, any other bot, a tool, a browser extension, etc.)
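To make the point concrete, here is a rough sketch of generating a minimal sitemap in the sitemaps.org XML format, so that a document nobody links to can still be found. The URLs and the function name build_sitemap are made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        url_element = ET.SubElement(urlset, "url")
        ET.SubElement(url_element, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # The second URL stands for a page that no other page links to.
    print(build_sitemap([
        "https://example.com/",
        "https://example.com/orphan-page",
    ]))
```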
I have an ASP.NET MVC application, and the visitors can select from two languages to view the site. My question is: should the URLs themselves also be language dependent, like:
/en/approach -> refers to the English page
/nl/aanpak -> refers to the Dutch page
or should I just use /en/approach for both the English and Dutch pages?
Thanks,
L
It would be better to split them out. It would allow search engines to index more pages and also makes the URLs hackable. Also, if you are planning to use output caching, you would easily be able to cache both localizations.
Here is a good post about how to accomplish localization with MVC.
How to localize ASP.NET MVC application?
Like Phil said... It helps you, the search engine, and your users understand that it's separate content.
And!:
Keep the content for each language on separate URLs. Don’t use cookies to show translated versions of the page. Consider cross-linking each language version of a page. That way, a French user who lands on the German version of your page can get to the right language version with a single click.
Avoid automatic redirection based on the user’s perceived language. These redirections could prevent users (and search engines) from viewing all the versions of your site.
- Google, Multi-regional and multilingual sites
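As a language-neutral illustration of the "separate URL per language" idea (sketched in Python rather than ASP.NET MVC), the /en/approach and /nl/aanpak URLs from the question can be kept in one lookup table that maps each localized slug to the same logical page. The table layout and the function names are hypothetical.

```python
# (language, localized slug) -> logical page name.
# The two entries come from the question; the structure itself is an assumption.
ROUTES = {
    ("en", "approach"): "approach",
    ("nl", "aanpak"): "approach",
}


def resolve(path):
    """Map a URL path such as /nl/aanpak to (language, logical page), or None."""
    segments = [segment for segment in path.split("/") if segment]
    if len(segments) != 2:
        return None
    lang, slug = segments
    page = ROUTES.get((lang, slug))
    if page is None:
        return None
    return lang, page


def alternates(page):
    """Return the URL of every language version of a logical page, for cross-linking."""
    return {
        lang: f"/{lang}/{slug}"
        for (lang, slug), target in ROUTES.items()
        if target == page
    }


if __name__ == "__main__":
    print(resolve("/nl/aanpak"))
    print(alternates("approach"))
```

The alternates helper is what would feed the cross-links between language versions mentioned in the Google quote.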
What would be a good way to handle URLs on a website that offers multiple languages, but has one primary language (in my case, English)?
What should be the address of the home page in English? http://example.com/? http://example.com/en/? http://example.com/english/? Other?
What should be the address of the home page in another language, say, German? http://example.com/german/? http://example.com/de/? http://example.com/deutsch/?
Would the use of language-specific subdomains be appropriate? What would you do and why?
It kind of depends on the structure of your site:
If every language is considered a completely different site, use sub-domains for the language.
This is because different sub-domains are considered different sites by many technologies. Wikipedia does this (http://de.wikipedia.org/) to separate content for different languages entirely.
I wouldn't recommend you to choose this option unless your site is very big.
If every language has its own structure, but they are still considered versions of the same site, use a top-level "directory" for languages.
For the sake of consistency, I would say that you should also have one for the default language (omitting it would cause a redirect to the appropriate structure). I would recommend using /en/, /de/, etc., since that is short and concise, and also the standard way of indicating languages.
This is probably your best bet.
If the structure of the site is identical no matter what language it is, and only content on the pages changes depending on the language, you could also consider putting the language modifier as a parameter: /home?lang=en
Google does this, for example: http://www.google.com/search?hl=de&q=foo (they also separate languages by TLD, though.)
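The three layouts above (sub-domain, top-level directory, query parameter) can be told apart with a few lines of URL parsing. This is only a sketch: the supported language set, the parameter name lang (taken from the /home?lang=en example) and the function name are assumptions.

```python
from urllib.parse import parse_qs, urlsplit

# Assumption: the site offers English and German, as in the question.
SUPPORTED = {"en", "de"}


def language_from_url(url, default="en"):
    """Detect the language of a URL under each of the three layouts."""
    parts = urlsplit(url)

    # 1. Sub-domain layout, e.g. http://de.wikipedia.org/
    first_label = (parts.hostname or "").split(".")[0]
    if first_label in SUPPORTED:
        return first_label

    # 2. Top-level directory layout, e.g. http://example.com/de/
    segments = [segment for segment in parts.path.split("/") if segment]
    if segments and segments[0] in SUPPORTED:
        return segments[0]

    # 3. Query parameter layout, e.g. /home?lang=en
    lang = parse_qs(parts.query).get("lang", [None])[0]
    if lang in SUPPORTED:
        return lang

    return default


if __name__ == "__main__":
    for url in (
        "http://de.wikipedia.org/",
        "http://example.com/de/",
        "http://example.com/home?lang=de",
        "http://example.com/",
    ):
        print(url, "->", language_from_url(url))
```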
Away from the question of how the international URLs should be styled (as that has been covered adequately already)...
One thing that I would personally do is make the site's 'main' domain (i.e. http://example.com) redirect the user appropriately depending on the Accept-Language HTTP header passed by the browser. This is what google.com does, for example.
If you do this, however, make sure that it's possible to switch to another language easily - and save the settings via some other mechanism to allow persistent override (cookies!).
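A rough sketch of that selection logic: a language saved in a cookie wins, otherwise the Accept-Language header is parsed by q-value, otherwise a default is used. The supported languages, the default and the function names are assumptions, and the header parsing is deliberately simplified.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    preferences = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # a tag without an explicit q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q-value as "not acceptable"
        preferences.append((tag.strip().lower(), quality))
    # sort() is stable, so tags with equal q keep their header order.
    preferences.sort(key=lambda pair: pair[1], reverse=True)
    return [tag for tag, quality in preferences if quality > 0]


def choose_language(accept_language, cookie_lang=None,
                    supported=("en", "de"), default="en"):
    """Pick the language for the main domain's redirect."""
    # The user's saved choice (the persistent override) always wins.
    if cookie_lang in supported:
        return cookie_lang
    for tag in parse_accept_language(accept_language or ""):
        primary = tag.split("-")[0]  # "de-DE" -> "de"
        if primary in supported:
            return primary
    return default


if __name__ == "__main__":
    print(choose_language("de-DE,de;q=0.9,en;q=0.8"))
    print(choose_language("de-DE,de;q=0.9", cookie_lang="en"))
```

The result would then be used to redirect http://example.com to the matching language URL.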
What should be the address of the home page
Would the use of language-specific subdomains be appropriate?
However you like; it doesn't really matter. Design it to be intuitive to the users.
Language names encoded in URLs won't matter for SEO because nobody will be searching for "en" or "de". The names of the products you're offering, however, will matter very much, because people will be searching for products like "gifts" or "geschenke".
I think the more elegant solution is to use an address in the format http://yourdomain.com as the home page URL, and to identify the localized web pages with ISO 639-1 language codes.
How do I implement a multilingual site in Umbraco 3.0?
There are two different approaches to this.
The documentation on the Umbraco website describes how to do 1:1 multilingual sites. This means that you have one site structure and different language tabs in a single document type for each translation of the content. The language is then selected using an on-page selector on the website (a flag icon or the like).
Here's an example of a 1:1 site
This is the most efficient set up if you have lots of shared content i.e. the content and structure is exactly the same, the language is just different.
The second approach is to use separate page structures for each language, such as:
International Homepage
------> English Homepage
------------> English content page
------> French Homepage
------------> French content page
The advantage of this structure is that it is very easy to set up, but if you share lots of content it can be cumbersome to manage. It also has the advantage that you can lock the editing permissions down for country/language specific editors.
With the above structure you can also point individual URLs to the country pages.
Without knowing more about what exactly your requirements are it's hard to answer more fully as to which is the best approach. It may also be possible to create a hybrid solution.
Here are some links which may help:
http://forum.umbraco.org/yaf_postst2209_Multilingual-structure-in-umbraco.aspx
http://www.nibble.be/?p=32