I needed to look something up in an XHTML specification on the W3C website, so I went to the XHTML2 Working Group Home Page. I was greeted by various encoding issues, such as trademark signs showing up as â„¢ character sequences. Now, normally when you see a page with an Â or â at the start of a strange sequence, you can be fairly certain the page is in a Unicode encoding, typically UTF-8, and is being read as a single-byte encoding instead. At first I thought auto-detect in Firefox was not turned on, but I checked and it definitely was. I selected Unicode as the encoding myself and, indeed, the page displayed normally. So I checked the page’s source and was lovingly greeted by the following:
<?xml version="1.0" encoding="iso-8859-1"?>
I am sure most of you can appreciate the delightful irony that an organization with a multitude of XML-based standards and specifications, which almost always use UTF-8 as the default encoding, declares the wrong encoding on one of its own pages. Yes, mistakes are human, but to see something like this on the W3C site…
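For the curious, here is a minimal Python sketch of what I believe happened. It assumes the page bytes really were UTF-8 and that the browser treated the declared iso-8859-1 as Windows-1252 (Python's `cp1252`), which browsers commonly do. The function names `mojibake` and `repair` are my own, purely for illustration.

```python
def mojibake(text: str, wrong: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong)


def repair(garbled: str, wrong: str = "cp1252") -> str:
    """Reverse the mistake: recover the original bytes, then decode as UTF-8."""
    # Assumption: the garbled text came from exactly one wrong decode.
    # Bytes that cp1252 leaves undefined (e.g. 0x81) would raise an error here.
    return garbled.encode(wrong).decode("utf-8")


if __name__ == "__main__":
    garbled = mojibake("\u2122")   # U+2122 is the trademark sign
    print(garbled)                 # the three-character sequence from the page
    print(repair(garbled))         # back to a single trademark sign
```

The trademark sign is three bytes in UTF-8 (E2 84 A2), and each byte becomes its own character when read as a single-byte encoding, which is where the â at the start comes from. Characters that take two bytes in UTF-8, such as ©, pick up a leading Â in the same way.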
Edit: for some reason WordPress keeps converting my less-than and greater-than signs into HTML entities, even when I use Unicode entities.