Why are special characters replaced by � in my Google Search results? - character-encoding

My website encoding is ISO-8859-1.
ISO-8859-1 is defined as charset in the web pages and Google Search results have always looked good.
However, for several weeks now, special characters (é, à, è, â, etc.) have been replaced by � in the Google Search results, in both page titles and page descriptions.
Screenshot of the Google Search rendering
The charset is defined on each page:
The website renders correctly in all web browsers, with no encoding errors.
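One common way the � symbol arises, though not necessarily the cause here, is that bytes written as ISO-8859-1 get decoded by a consumer that assumes UTF-8. The following minimal sketch illustrates that mechanism; the sample text is an assumption and is not taken from the site in question.

```python
# Assumed sample text; not taken from the site in question.
title = "Café à Genève"

# Bytes as an ISO-8859-1 page would send them.
latin1_bytes = title.encode("iso-8859-1")

# A consumer that wrongly assumes UTF-8 cannot decode the accented bytes
# and substitutes U+FFFD (the replacement character, shown as �) for each.
misread = latin1_bytes.decode("utf-8", errors="replace")

# Decoding with the declared charset recovers the original text.
correct = latin1_bytes.decode("iso-8859-1")

print(misread)
print(correct)
```

If the second line prints the expected text while the first shows replacement characters, the bytes themselves are fine and only the assumed charset differs.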

Related

HTTrack gives 404 on unicode urls with german special characters

I've noticed that HTTrack can't download files whose URLs contain special characters, such as the German ß; it reports a 404 response.
The errors look like those in the screenshot:
Is there any setting in HTTrack to make it able to deal with such characters?
P.S. I found a similar thread, but it has no answer:
Httrack faulty when encountering japanese encoded URLS
HTTrack seems able to fetch files from URLs with special characters without errors, but only if you don't run a "real" domain crawl and instead:
first create a URL list,
save it as ISO-8859-1,
then let HTTrack crawl this list.
If HTTrack discovers the URLs on its own, it runs into 404 errors on URLs with special characters; at least I wasn't able to fetch them without errors. Maybe somebody will provide a magic setting ;)
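The "save it as ISO-8859-1" step above could be done with a short script such as the sketch below. The URLs and the file name `urls.txt` are hypothetical placeholders, and the script only prepares the list; how the list is handed to HTTrack is not covered here.

```python
from pathlib import Path

# Hypothetical URLs containing the German ß; replace with your own.
urls = [
    "http://example.com/straße/index.html",
    "http://example.com/fußball.html",
]

# Hypothetical file name for the list that HTTrack will be pointed at.
list_path = Path("urls.txt")

# Write one URL per line, encoded as ISO-8859-1 rather than UTF-8.
list_path.write_text("\n".join(urls) + "\n", encoding="iso-8859-1")

# In ISO-8859-1 the ß is the single byte 0xDF (UTF-8 would use 0xC3 0x9F).
raw = list_path.read_bytes()
print(raw)
```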

Browser Support for UTF8 Encoded Characters in URLs

If I navigate to the following URL with a special UTF8 encoded character I get different results in web browsers:
http://example.com/lörickè
Firefox 37 - Shows the correct URL as above.
Chrome 42 - Shows the correct URL as above.
Edge - Shows the correct URL as above.
IE 11 - Shows percent encoded URL http://example.com/l%c3%b6rick%c3%a8/
Where can I find a list of browsers and versions that support this feature, and are there any announcements on whether the new Microsoft Edge browser supports it?
This StackOverflow post highlights the above issue for those interested.
What is shown in browser address bars is not necessarily what is used internally.
If you enter http://example.com/lörickè in Firefox, it gets shown like that, but it actually gets percent-encoded and becomes http://example.com/l%C3%B6rick%C3%A8. This is for usability reasons (or, if IRIs are not supported, like in HTTP/1.1, for transforming an IRI into a URI), so users don’t necessarily have to enter the correct URL (with percent-encoding), and don’t get confused by seeing these cryptic parts.
You can easily check what really gets used by copy-pasting the URL from the address bar into a text document.
So the browsers from your example probably all use the same URI (i.e., percent-encoded); some of them simply display the un-encoded variant instead.
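The percent-encoding described above can be reproduced outside a browser. This is a minimal sketch using Python's standard `urllib.parse` module; it illustrates the transformation and makes no claim about how any particular browser implements it.

```python
from urllib.parse import quote, unquote

# Path component from the example URL in the question.
path = "/lörickè"

# quote() encodes non-ASCII characters as UTF-8 bytes and percent-escapes
# each byte; "/" is left untouched by default.
encoded = quote(path)

# unquote() reverses the transformation, which is roughly what an address
# bar does when it displays the readable form.
decoded = unquote(encoded)

print(encoded)
print(decoded)
```

The hex digits in a percent-escape are case-insensitive, so the lowercase `%c3%b6` that IE 11 displays denotes the same bytes as `%C3%B6`.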

Characters with accents from a MySQL DB showing correctly on PHP pages but not on HTML

I have searched extensively and applied the obvious fixes, but I seem to have another variant of the problem. I have PHP pages that display which song is currently playing, which songs are coming next, and which were recently played on my web radio station; the info comes from MySQL. The characters are displayed correctly on the PHP pages.
This is where it gets tricky: I also have HTML pages that load two divs from a PHP page, so that the upcoming songs also display on those HTML pages. There, the accented characters don't show correctly. I have the correct meta tag in the header of those pages and have also used the .htaccess file trick (although I was not sure how much the location of the line in the file matters, so I tried various places). I even opened my .htaccess in Notepad++ to change the encoding to UTF-8 without BOM. I also added a UTF-8 meta tag to the PHP page header, and then the characters stopped working on the PHP pages too, so presumably you're not supposed to do that.
As you can see, I have spent a lot of time on this. Interestingly, the characters display correctly on iPad; it's in the PC browsers that it doesn't work. Maybe no one has ever tried loading divs from PHP into HTML with special characters before. If anyone is interested in having a think, that would be great, but it's not a vital problem, just a nice-to-have fix. The server side is hosted on a hosting site.
thanks

Internet Explorer does not display Chinese characters from the URL

I am working on a requirement to display (make readable) characters from the URL.
When I use Google Chrome, it displays the parameters in Chinese - even though they are encoded to UTF-8.
When I use Mozilla Firefox, it displays the parameters in Chinese - even though they are encoded to UTF-8.
When I use Internet Explorer, it displays the parameters encoded in UTF-8.
N.B. The URL is encoded to UTF-8; I know that because when I copy the URL from the three of them and paste it to Notepad++ the three of them display the following:
/%E6%89%93%E5%BC%80%E7%9B%AE%E5%BD%95/%E7%9B%B8%E6%9C%BA/%E6%95%B0%E7%A0%81%E7%9B%B8%E6%9C%BA/%E5%B0%8F%E5%9E%8B%E6%95%B0%E7%A0%81%E7%9B%B8%E6%9C%BA/PowerShot-A480/p/1934793
Could it be that Mozilla Firefox and Google Chrome guys have this improvement that can make an encoded String readable and perhaps the IE guys do not support that? Or, is there any way to activate that with IE?
By the way... Going to View >> Encoding >> Unicode (UTF-8) takes care of the text inside of the page but does not make any difference for the text in the URL.
Any help will be greatly appreciated!
I've written a blog post about Internet Explorer not displaying the decoded version of non-ASCII characters and using IRIs to solve the problem.
As of today, we have the following situation:
HTML5 supports IRIs, i.e. URIs with Unicode character support
HTTP does not support IRIs, but all major browsers take care of converting IRIs to valid (encoded) URIs to retrieve the specified resource (page).
IE supports IRIs in the href attribute of anchor tags and properly displays them in its address bar just like when you enter your URL by hand (keyboard ;-)).
If you choose to percent-encode your IRI thus making it a URI, IE will not decode that URI back into an IRI.
So you could try the following:
Save your HTML files using UTF-8. This allows you to insert any Unicode character into it.
Do not percent-encode your URLs inside your HTML pages' links. Just use links like this: 亦思巴奚兵乱
A great article on the topic can also be found at the W3C: An Introduction to Multilingual Web Addresses.
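To see the relationship between the percent-encoded URI from the question and the readable IRI form recommended above, the path can be decoded and re-encoded. This is a sketch using Python's standard `urllib.parse`, not a description of what any browser does internally.

```python
from urllib.parse import quote, unquote

# Percent-encoded path copied from the question.
uri_path = (
    "/%E6%89%93%E5%BC%80%E7%9B%AE%E5%BD%95/%E7%9B%B8%E6%9C%BA/"
    "%E6%95%B0%E7%A0%81%E7%9B%B8%E6%9C%BA/"
    "%E5%B0%8F%E5%9E%8B%E6%95%B0%E7%A0%81%E7%9B%B8%E6%9C%BA/"
    "PowerShot-A480/p/1934793"
)

# Decoding the UTF-8 percent-escapes yields the readable IRI form,
# i.e. what would be written directly in an href.
iri_path = unquote(uri_path)

# Re-encoding the IRI gives back the URI that is sent over HTTP.
round_trip = quote(iri_path)

print(iri_path)
print(round_trip == uri_path)
```

Both forms identify the same resource; the difference is only in which one appears in the HTML source and, as a result, in IE's address bar.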

different url in different browsers

I'm using ASP.NET MVC. I made a link using Html.ActionLink. Chrome and Firefox show me an SEO-friendly URL like this:
http://localhost:3267/Store/Browse?category=آموزش-برنامه-نویسی
but IE 9.0 shows me something like this:
http://localhost:3267/Store/Browse?category=%D8%A7%D9%93%D9%85%D9%88%D8%B2%D8%B4-%D8%A8%D8%B1%D9%86%D8%A7%D9%85%D9%87-%D9%86%D9%88%DB%8C%D8%B3%DB%8C
What should I do to show the SEO-friendly URL in IE as well?
Those hexadecimal characters -- the ones that start with the % sign -- are the characters in the string, encoded for the URL. As far as a search engine is concerned, they are the same characters.
