I needed to look something up in an XHTML specification over at the W3 Consortium website, so I went to the XHTML2 Working Group Home Page. I was greeted with various encoding issues: trademark signs were showing up as â„¢ character sequences. Now, normally when you see a page with an Ã or â at the start of a strange sequence, you can be fairly certain the page is in a Unicode encoding, typically UTF-8, and is being read as something else. So at first I thought auto-detect within Firefox was not turned on. I checked it, and no, it was definitely on. I selected Unicode as the encoding myself and, indeed, the page displayed normally. So I checked the page’s source. I was lovingly greeted by the following:
<?xml version="1.0" encoding="iso-8859-1"?>
I am sure most of you can appreciate the delightful irony that the organization behind a multitude of XML-based standards and specifications, which almost always use UTF-8 as the default encoding, declares the wrong encoding on one of its own pages. Yes, mistakes are human, but to see something like this on the W3C site…
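For anyone curious how a trademark sign ends up as â„¢, here is a minimal Python sketch of the mix-up. It assumes the browser treats the declared iso-8859-1 as windows-1252 (cp1252), which browsers commonly do; the function names `misread_utf8` and `repair` are my own, made up for illustration.

```python
# Minimal sketch: reproduce the mojibake described above.
# Assumption: the declared iso-8859-1 is handled as windows-1252 (cp1252).

def misread_utf8(text: str, wrong_encoding: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode those bytes with the wrong encoding."""
    return text.encode("utf-8").decode(wrong_encoding)


def repair(garbled: str, wrong_encoding: str = "cp1252") -> str:
    """Undo the mistake: recover the original bytes and decode them as UTF-8."""
    return garbled.encode(wrong_encoding).decode("utf-8")


trademark = "\u2122"                # the trademark sign, bytes E2 84 A2 in UTF-8
garbled = misread_utf8(trademark)   # three characters, one per byte

print(garbled)
print(repair(garbled) == trademark)
```

Each of the three UTF-8 bytes gets displayed as its own single-byte character, which is why one sign turns into three. The repair only works while every byte maps to a character in the wrong encoding, so treat it as an illustration rather than a general fix.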
Edit: for some reason WordPress keeps converting my less-than and greater-than signs into HTML entities, even when I use Unicode entities.
WordPress isn’t very accommodating to those who know what they’re doing with code. ;P I share your pain.