Small touches that inspire

It’s the littlest of things that can really brighten my mood when I notice them. In this case I was watching the trailer for Honest Hearts, a DLC for Fallout: New Vegas. In the trailer you see the player with a pistol, and on at least one side of the pistol is written: “καὶ ἡ σκοτία αὐτὸ οὐ κατέλαβεν”. This is Greek and is the second part of John 1:5 in the New Testament of the Bible, meaning in English: “and the darkness did not comprehend it”. In my opinion a great way to bring enlightenment by the bullet.

CLDR 1.8 data submission closing

The data submission phase for CLDR 1.8 should be closed by now (although the survey tool still says it’s accepting submissions). For Dutch (nl_NL), I’ve been going over quite a few items together with the Apple contributor and someone else, so expect a good number of improvements in that area. The release is currently aimed at somewhere in March 2010.

Office 2010 Chinese language pack font list

It looks like the Chinese Office 2010 font list is the following (Changzhou SinoType, Founder, Microsoft, Stone):

  • FZShuTi
  • FZYaoTi
  • LiSu
  • Microsoft YaHei
  • Microsoft YaHei Bold
  • STCaiyun
  • STFangsong
  • STHupo
  • STKaiti
  • STLiti
  • STSong
  • STXihei
  • STXingkai
  • STXinwei
  • STZhongsong
  • YouYuan

From the language pack make sure to select 国际字体 (international fonts) and 校对工具 (proofing tools). Under 国际字体 we have 典型字体 (typical fonts) and under 校对工具 we have 简体中文校对工具 (Simplified Chinese proofing tools) and 英语校对工具 (English proofing tools).

Office 2010 Japanese language pack font list

It looks like the Japanese Office 2010 font list is the following (all by RICOH):

  • HGGothicE
  • HGGothicM Medium
  • HGGyoshotai Medium
  • HGKyokashotai Medium
  • HGMaruGothicMPRO
  • HGMinchoB Bold
  • HGMinchoE
  • HGPGothicE
  • HGPGothicM Medium
  • HGPGyoshotai Medium
  • HGPKyokashotai Medium
  • HGPMinchoB Bold
  • HGPMinchoE
  • HGPSoeiKakugothicUB
  • HGPSoeiKakupoptai
  • HGPSoeiPresence EB Extra-Bold
  • HGSeikaishotaiPRO
  • HGSGothicE
  • HGSGothicM Medium
  • HGSGyoshotai Medium
  • HGSKyokashotai Medium
  • HGSMinchoB Bold
  • HGSMinchoE
  • HGSoeiKakugothicUB
  • HGSoeiKakupoptai
  • HGSoeiPresenceE Extra-Bold
  • HGSSoeiKakugothicUB
  • HGSSoeiKakupoptai
  • HGSSoeiPresence EB Extra-Bold

Microsoft Office 2010, typography, and proofing tools

Microsoft has released Office 2010 as a beta that you can use up to and including October 2010 (the final version is scheduled to be released in June 2010). You can download it as either 32-bit or 64-bit, although the 64-bit download seems a bit hidden, since many download buttons lead to the default 32-bit download. If you follow the link at the Professional Plus site to ‘Get It Now’ you should be presented with links to both versions. At the moment Microsoft supports Chinese (Simplified), English, French, German, Japanese, Russian, and Spanish. If you are like me you just use the application in English, but then miss some of the proofing tools for, say, Japanese.

You can download language packs from the Microsoft Download Center. If you change the language to, say, Japanese you are presented with two download links at the bottom for the Japanese language pack. This language pack includes user interface changes for Japanese as well as proofing tools, OCR support, and fonts.

Once the pack is downloaded, just run it and you can customize what you want to install. Since I am not interested in the UI aspects of the pack, I selected the top part and toggled the selection for everything to not install. Then for the entries 国際フォント (international fonts) and 文章校正ツール (proofing tools) I made sure to install everything. 文章校正ツール includes both 日本語用校正ツール (Japanese proofing tools) and 英語用校正ツール (English proofing tools), and I guess you can most likely skip 英語用校正ツール since it is already installed. 国際フォント includes 標準フォント (standards font), which I am guessing is related to JIS X standards for font encodings.

Basic Windows 7 has 134 fonts installed. A basic English Office 2010 install increases this to 198 fonts installed. Installing the Japanese language pack proofing tools with fonts brings this to 228 fonts installed.

If you press the expansion arrow at the bottom-right of the Home part of the ribbon (or press CTRL-D) you will get the Font dialog. If you select the Advanced tab you can turn on features such as OpenType ligatures. This will mean that with text such as ‘fl’ or ‘ffi’ certain parts of the letters will connect instead of showing white space between the letters. This is the same technique used in printed media such as books.

Update: Michael Hendry was kind enough to point out that I was mistaking 標準 (standard/default) for 基準 (standards/JIS/ISO).

James Cameron and his drive

A friend of mine pointed me to this article about James Cameron and his latest film “Avatar”. I am personally much inspired by such things and I hope I share at least some minor part of this kind of zeal in delivering perfectionist accomplishments. I love how he hired experts from different areas of expertise to work on the language, flora, or other parts of his fantasy world, all to make the world more consistent. This is the bread and butter of making an experience fully immersive. Sure, it might be wasted on the audience who just goes to watch the movie, but people like myself appreciate this. I am not sure how many people experience this, but whenever I play a game where I notice that some design has been reused, or watch or read something where I notice the consistency is off, I feel kind of let down. I guess it is hard for me to understand why other people would not go the extra mile to avoid such problems.

Inner Universe

Ангелы и демоны кружили надо мной
Разбивали тернии и звёздные пути
Не знает счастья только тот,
Кто его зова понять не смог…

Mana du vortis, Mana du vortis
Aeria gloris, Aeria gloris
Mana du vortis, Mana du vortis
Aeria gloris, aeria gloris

I am Calling Calling now, Spirits rise and falling
Собой остаться дольше…
Calling Calling, in the depth of longing
Собой остаться дольше…

Mana du vortis, Mana du vortis
Aeria gloris, Aeria gloris

Stand alone… Where was life when it had a meaning…
Stand alone… Nothing’s real anymore and…

…Бесконечный бег…
Пока жива я могу стараться на лету не упасть,
Не разучиться мечтать…любить…
…Бесконечный бег…

Calling Calling, For the place of knowing
There’s more that what can be linked
Calling Calling, Never will I look away
For what life has left for me

Yearning Yearning, for what’s left of loving
Собой остаться дольше…
Calling Calling now, Spirits rise and falling…
Собой остаться дольше…
Calling Calling, in the depth of longing…
Собой остаться дольше…

Mana du vortis, Mana du vortis
Aeria gloris, Aeria gloris
Mana du vortis, Mana du vortis
Aeria gloris, aeria gloris

Inner Universe by ORIGA

Character encoding in mailcap for mutt and w3m

I use mutt on my FreeBSD system to read my mail. To read HTML mail I simply use a .mailcap file with an entry such as

text/html;      w3m -dump %s; nametemplate=%s.html; copiousoutput

This in effect dumps the HTML using w3m to a text file in order to safely display it. The problem that I had is that some of the emails I receive are from a Japanese translators list and are in Shift_JIS. When dumping them, w3m doesn’t properly detect the Shift_JIS encoding and as such the resulting output becomes garbled.
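To illustrate the kind of garbling described above, here is a minimal Python sketch. It has nothing to do with w3m's own detection logic; the helper name decode_body and the sample text are made up for illustration. It only shows that the same Shift_JIS bytes read fine with the right charset and turn into nonsense with a wrong guess.

```python
def decode_body(raw, charset):
    """Decode raw mail bytes using the charset we believe they are in."""
    # errors="replace" keeps the sketch from raising on undecodable bytes.
    return raw.decode(charset, errors="replace")


# Hypothetical sample text standing in for the body of a Japanese mail.
original = "日本語のメール"

# The bytes as the sender wrote them: Shift_JIS encoded.
raw = original.encode("shift_jis")

# Correct charset: the text survives the round trip.
print(decode_body(raw, "shift_jis") == original)

# Wrong guess: Latin-1 accepts any byte, so we silently get mojibake.
print(repr(decode_body(raw, "latin-1")))
```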

When I looked at the attachments in the mail with mutt’s ‘v’ command I saw that mutt at least knows the encoding of the attachment, so I figured that there should be a way of using this information with my mailcap. There is indeed a way to do so, namely the charset variable. It turns out the mailcap format is specified in a full RFC, RFC 1524 to be exact. Mutt furthermore uses the Content-Type headers to pull any specific settings into mailcap variables. So a Content-Type: text/html; charset=shift_jis means that %{charset} in the mailcap file will be expanded to shift_jis. We can use this with w3m’s -I flag to set a proper encoding prior to dumping.

text/html;      w3m -I %{charset} -dump %s; nametemplate=%s.html; copiousoutput

As such you can be relatively sure that the dumped text will be in the appropriate encoding. Of course it depends on a properly set Content-Type header, but if you cannot depend on that one you need to dig out the recovery tools already.
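The expansion described above can be sketched in a few lines of Python. This is a simplified illustration and not mutt's actual implementation: the function name expand_mailcap_command and the file path are hypothetical, and treating an unknown parameter as an empty string is an assumption of this sketch.

```python
import re
import shlex
from email.message import Message


def expand_mailcap_command(template, content_type, filename):
    """Expand %s and %{param} in a mailcap command template (simplified)."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_params() returns [(type, ''), (name, value), ...]; skip the type.
    params = {name.lower(): value for name, value in msg.get_params()[1:]}

    def substitute(match):
        if match.group(0) == "%s":
            return shlex.quote(filename)
        # Assumption: parameters missing from the header expand to ''.
        return shlex.quote(params.get(match.group(1).lower(), ""))

    return re.sub(r"%s|%\{([^}]+)\}", substitute, template)


command = expand_mailcap_command(
    "w3m -I %{charset} -dump %s",
    "text/html; charset=shift_jis",
    "/tmp/mail.html",  # hypothetical temporary file name
)
print(command)
```

The values are passed through shlex.quote because they come straight from a mail header and end up on a shell command line.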

Why using ‘lorem ipsum’ is bad for web site testing

The typesetting and web design industry has apparently been using the ‘lorem ipsum’ text for a while as dummy text to test print and layout.

Aside from the fact that the text is a cut-off section of Cicero’s De finibus bonorum et malorum, it also fails in one huge aspect, namely globalisation.

The text is in Latin, and the Latin script is the simplest of all the scripts available to us on the world-wide web. If your website is English only then, yes, you are quite done. However, a lot of us also have to support languages other than English, the easiest of which use Latin-derived scripts.

Latin, and subsequently English, are both written left-to-right. Hebrew and Arabic, to take two prime examples, are written right-to-left (leaving numerals aside for the moment). Of course, this is also very important to test, since it means a lot of changes are needed for your layout.
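As a rough illustration of the difference, the following Python sketch guesses the base direction of a piece of test text from its first strongly directional character, which is the same basic idea the Unicode Bidirectional Algorithm uses for paragraphs. The function name base_direction and the sample strings are made up for this example, and real layout testing needs far more than this check.

```python
import unicodedata

# Bidi classes of strongly right-to-left characters (Hebrew, Arabic, ...).
RTL_CLASSES = {"R", "AL"}


def base_direction(text):
    """Return 'rtl' or 'ltr' based on the first strong character."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in RTL_CLASSES:
            return "rtl"
        if bidi == "L":
            return "ltr"
    # No strong character found (e.g. only digits): default to ltr.
    return "ltr"


for sample in ("Lorem ipsum dolor sit amet", "שלום עולם", "مرحبا بالعالم"):
    print(base_direction(sample), sample)
```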

Especially when testing your design for sites that need to display multiple languages on the same page, it is important to test with multilingual text. One of the things that should quickly become clear is whether or not a suitable encoding has been chosen.
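A quick way to see whether an encoding is up to the job is to try to encode a handful of multilingual samples with it. The Python sketch below does just that; the sample sentences and the function name unsupported_languages are illustrative choices, not a complete test set.

```python
# Hypothetical multilingual test strings; extend with the languages you support.
SAMPLES = {
    "English": "The quick brown fox",
    "Dutch": "Één café, twee cafés",
    "Greek": "καὶ ἡ σκοτία",
    "Hebrew": "שלום עולם",
    "Japanese": "日本語のテキスト",
}


def unsupported_languages(encoding, samples=SAMPLES):
    """Return the languages whose sample cannot be stored in `encoding`."""
    failures = []
    for language, text in samples.items():
        try:
            text.encode(encoding)
        except UnicodeEncodeError:
            failures.append(language)
    return failures


for encoding in ("iso-8859-1", "utf-8"):
    print(encoding, unsupported_languages(encoding))
```

With lorem ipsum as the only test text, every encoding in such a check would pass, which is exactly the problem.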