Make Mercurial Filenames Work on Windows

While writing a post for my blog I noticed that on Windows some of the filenames on disk showed encoding problems. The blog is stored in Mercurial, so somewhere between the repository and the checkout on Windows the character encoding of the filenames goes wrong.

After some research and conversations with people on the #mercurial IRC channel, I understand that Mercurial stores filenames internally as Python byte strings, without any character encoding attached to them. On Windows those bytes are then interpreted in the native ANSI codepage, in my case codepage 1252.
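To illustrate the kind of mismatch described above, here is a minimal Python sketch. It assumes the filename was committed as UTF-8 bytes, which the post does not state explicitly, and the helper name and the example filename are made up for the illustration. It shows what the same bytes look like when read as codepage 1252 instead.

```python
def as_seen_in_cp1252(filename: str) -> str:
    """Encode a name as UTF-8, then misread those bytes as codepage 1252.

    Assumption: the name was stored as UTF-8 bytes. The function name
    and the sample filename below are hypothetical.
    """
    raw = filename.encode("utf-8")
    # errors="replace" guards against the few byte values that are
    # undefined in Python's cp1252 codec.
    return raw.decode("cp1252", errors="replace")


original = "café.md"
print(original.encode("utf-8"))     # b'caf\xc3\xa9.md'
print(as_seen_in_cp1252(original))  # cafÃ©.md
```

The two bytes that make up the é in UTF-8 each turn into a separate character in codepage 1252, which is the garbled result that ends up on disk.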

Thankfully Windows 10 nowadays has a wonderful option to fix this issue. Go to Control Panel, click Clock and Region, click Region, click Administrative, and under Language for non-Unicode programs click Change system locale. In the window that pops up, tick the checkbox in front of Beta: Use Unicode UTF-8 for worldwide language support. Maybe by the time you are reading this the beta label has already been removed. Click OK and restart the system when asked.
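After the restart you may want to verify that the setting took effect. The following is a small sketch, not something from the original troubleshooting session: it asks Windows for the active ANSI codepage through the Win32 GetACP function via ctypes. The helper names are my own, and on non-Windows systems the lookup simply returns None.

```python
import ctypes
import sys

UTF8_CODEPAGE = 65001  # Windows codepage identifier for UTF-8


def ansi_codepage():
    """Return the active ANSI codepage on Windows, or None elsewhere."""
    if sys.platform != "win32":
        return None
    return ctypes.windll.kernel32.GetACP()


def is_utf8_codepage(codepage):
    """True if the given codepage number is the UTF-8 codepage."""
    return codepage == UTF8_CODEPAGE


codepage = ansi_codepage()
if codepage is None:
    print("Not running on Windows, nothing to check.")
elif is_utf8_codepage(codepage):
    print("ANSI codepage is UTF-8 (65001).")
else:
    print(f"ANSI codepage is still {codepage}.")
```

With the option enabled the reported codepage should be 65001. In my case it would have been 1252 before the change.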

You will need to clone the repository again, since Mercurial (TortoiseHg) has to write the files to disk again to generate the filenames properly.