Tag Archives: japanese

Anything related to the Japanese language

Office 2010 Japanese language pack font list

It looks like the Japanese Office 2010 font list is the following (all by RICOH):

  • HGGothicE
  • HGGothicM Medium
  • HGGyoshotai Medium
  • HGKyokashotai Medium
  • HGMaruGothicMPRO
  • HGMinchoB Bold
  • HGMinchoE
  • HGPGothicE
  • HGPGothicM Medium
  • HGPGyoshotai Medium
  • HGPKyokashotai Medium
  • HGPMinchoB Bold
  • HGPMinchoE
  • HGPSoeiKakugothicUB
  • HGPSoeiKakupoptai
  • HGPSoeiPresence EB Extra-Bold
  • HGSeikaishotaiPRO
  • HGSGothicE
  • HGSGothicM Medium
  • HGSGyoshotai Medium
  • HGSKyokashotai Medium
  • HGSMinchoB Bold
  • HGSMinchoE
  • HGSoeiKakugothicUB
  • HGSoeiKakupoptai
  • HGSoeiPresenceE Extra-Bold
  • HGSSoeiKakugothicUB
  • HGSSoeiKakupoptai
  • HGSSoeiPresence EB Extra-Bold

Microsoft Office 2010, typography, and proofing tools

Microsoft has released Office 2010 as a beta that you can use up to and including October 2010 (the final version is scheduled for release in June 2010). You can download it as either 32-bit or 64-bit, although the 64-bit download is a bit hidden, since many of the download buttons lead to the default 32-bit download. If you follow the ‘Get It Now’ link at the Professional Plus site you should be presented with links to both versions. At the moment Microsoft supports Chinese (Simplified), English, French, German, Japanese, Russian, and Spanish. If you are like me you just use the application in English, but then miss some of the proofing tools for, say, Japanese.

You can download language packs from the Microsoft Download Center. If you change the language to, say, Japanese you are presented with two download links at the bottom for the Japanese language pack. This language pack includes user interface changes for Japanese as well as proofing tools, OCR support, and fonts.

Once the pack is downloaded, just run it and you can customize what you want to install. Since I am not interested in the UI aspects of the pack, I selected the top part and toggled the selection so that nothing would be installed. Then for the entries 国際フォント (international fonts) and 文章校正ツール (proofing tools) I made sure to install everything. 文章校正ツール includes both 日本語用校正ツール and 英語用校正ツール, and I guess you can most likely skip 英語用校正ツール since it is already installed. 国際フォント includes 標準フォント (standards font), which I am guessing is related to JIS X standards for font encodings.

Basic Windows 7 has 134 fonts installed. A basic English Office 2010 install increases this to 198 fonts (64 more). Installing the Japanese language pack proofing tools with fonts brings this to 228 fonts (another 30).

If you press the expansion arrow at the bottom-right of the Home part of the ribbon (or press CTRL-D) you will get the Font dialog. If you select the Advanced tab you can turn on features such as OpenType ligatures. This will mean that with text such as ‘fl’ or ‘ffi’ certain parts of the letters will connect instead of showing white space between the letters. This is the same technique used in printed media such as books.

Update: Michael Hendry was kind enough to point out that I was mistaking 標準 (standard/default) for 基準 (standards/JIS/ISO).

Character encoding in mailcap for mutt and w3m

I use mutt on my FreeBSD system to read my mail. To read HTML mail I simply use a .mailcap file with an entry such as

text/html;      w3m -dump %s; nametemplate=%s.html; copiousoutput

This in effect dumps the HTML using w3m to a text file in order to safely display it. The problem that I had is that some of the emails I receive come from a Japanese translators list and are in Shift_JIS. When dumping them, w3m does not properly detect the Shift_JIS encoding, so the resulting output is garbled.

When I looked at the attachments in the mail with mutt’s ‘v’ command I saw that mutt at least knows the encoding of the attachment, so I figured that there should be a way of using this information with my mailcap. There is indeed a way to do so, namely the charset variable. It turns out the mailcap format is specified in a full RFC, RFC 1524 to be exact. Mutt furthermore uses the Content-Type header to pull any parameters into mailcap variables. So a Content-Type: text/html; charset=shift_jis means that %{charset} in the mailcap file will be expanded to shift_jis. We can use this with w3m’s -I flag to set a proper encoding prior to dumping.

text/html;      w3m -I %{charset} -dump %s; nametemplate=%s.html; copiousoutput

As such you can be relatively sure that the dumped text will be in the appropriate encoding. Of course it depends on a properly set Content-Type header, but if you cannot depend on that one you need to dig out the recovery tools already.
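To make the expansion step concrete, here is a minimal Python sketch of how a %{charset} placeholder could be filled in from a Content-Type header. It is not how mutt implements it; the function name expand_mailcap and the path /tmp/mail.html are made up for illustration, and only %s, %t and %{name} are handled.

```python
import re
import shlex
from email.message import Message


def expand_mailcap(template: str, content_type: str, filename: str) -> str:
    """Expand %s, %t and %{param} in a mailcap command (hypothetical helper)."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_params() yields (name, value) pairs; the first pair is the MIME type
    # itself, so skip it. Parameter names come back lowercased.
    params = dict(msg.get_params(failobj=[])[1:])

    def replace(match):
        token = match.group(0)
        if token == "%s":
            return shlex.quote(filename)
        if token == "%t":
            return msg.get_content_type()
        # %{name}: look up the Content-Type parameter, empty string if absent
        return shlex.quote(params.get(match.group(1).lower(), ""))

    return re.sub(r"%s|%t|%\{([^}]+)\}", replace, template)


command = expand_mailcap(
    "w3m -I %{charset} -dump %s",
    "text/html; charset=shift_jis",
    "/tmp/mail.html",  # assumed temporary file name
)
print(command)
```

With the header from the example above, the printed command contains -I shift_jis. If the header carries no charset parameter, this sketch substitutes an empty quoted string, which is exactly the case where the dump would still come out garbled.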

Microsoft IME 2007 on Windows x64

So I was updating my input method editors (IME) from the default in Windows x64 (IME 2002) to the ones provided by Office 2007’s language packs. As explained in a previous post of mine you can install the proofing tools and input methods by passing LAUNCHEDBYSETUPEXE=1 when executing the MSI. Now, on my Windows x64 I installed the IME by installing IME64.MSI with this added variable. The weird thing was that some applications worked flawlessly and yet others showed me the wrong number of icons or no icons at all! It turns out that these applications are 32-bit applications and need to have the 32-bit IME installed as well. So next to installing the IME64.MSI of the language you want, you will also have to install IME32.MSI. Only after doing this will you notice the applications working as you want them to.
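As a rough sketch of the two installs, the Python snippet below only builds the msiexec command lines for both packages. The directory D:\IME\ja-jp is a made-up example path, the helper name is hypothetical, and I am assuming both MSI files sit in the same directory, which may not match the layout of your language pack.

```python
import ntpath
from typing import List


def ime_install_commands(msi_dir: str) -> List[List[str]]:
    """Build msiexec command lines for the 64-bit and 32-bit IME packages."""
    commands = []
    for msi in ("IME64.MSI", "IME32.MSI"):
        commands.append([
            "msiexec",
            "/i",                       # standard msiexec install option
            ntpath.join(msi_dir, msi),  # Windows-style path on any host OS
            "LAUNCHEDBYSETUPEXE=1",     # property mentioned in the post
        ])
    return commands


# Example directory is an assumption; adjust it to where the MSIs really are.
for cmd in ime_install_commands(r"D:\IME\ja-jp"):
    print(" ".join(cmd))
```

On an actual Windows machine you could hand each list to subprocess.run(cmd, check=True) instead of printing it.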

Thinking back on it, it makes perfect sense, but while you are in the middle of working with it you keep wondering: “why?”

Rangaku taking form

I spent the past three days doing a lot of coding for my Dutch-Japanese dictionary site Rangaku (or better known as Kouyou). I am happy to see that things are finally starting to pull together now. I had intended to publish things much sooner, but I had a lot of catching up to do on the Python front. Partially thanks to my new job and my own endeavours in my spare time I am mastering Python, at least to a level at which I feel reasonably comfortable with it. Of course, the release of various tools like Werkzeug, Genshi or SQLAlchemy helped me a lot as well.

The Elephant (象)

I will bear criticism like an elephant in battle bears an arrow from a bow. Most people are bad behaviour. (戦場の象が、射られた矢にあたっても堪え忍ぶように、われらはひとのそしりを忍ぼう。多くの人は実に性質(たち)が悪いからである。)

One can take a trained elephant even into a crowd. The king himself will ride a trained elephant. He who is disciplined is the best of men, since he can bear criticism. (馴らされた象は、戦場にも連れて行かれ、王の乗りものとなる。世のそしりを忍び、自らをおさめた者は、人々の中にあっても最上の者である。)

Trained mules are excellent, and so are thoroughbred horses from the Sindh, and so are great battle elephants, but more excellent than them all is a disciplined man. (馴らされた騾馬は良い。インダス河のほとりの血統よき馬も良い。クンジャラという名の大きな象も良い。しかし自己をととのえた人はそれらよりもすぐれている。)

There is no reaching the unattainable with mounts like these, but with himself well under control a disciplined man can get there. (何となれば、これらの乗物によっては未到の地(ニルヴァーナ)に行くことはできない。そこへは、慎しみある人が、おのれ自らをよくととのえておもむく。)

Dhammapalo, the elephant, is hard to control in rut. Even when tied up, he refuses his food. The great tusker is thinking of the elephant forest. (「財を守る者」という名の象は、発情期にこめかみから液汁をしたたらせて強暴になっているときは、いかんとも制し難い。捕らえられると、一口の食物も食べない。象は象の林を慕っている。)

When a man is a lie-abed and over-eats, a lazy person who wallows in sleep like a great over-fed hog, a fool like that will be reborn time after time. (大食いをして、眠りをこのみ、ころげまわって寝て、まどろんでいる愚鈍な人は、大きな豚のように糧を食べて肥り、くりかえし母胎に入って(迷いの生存をつづける)。)

My mind used formerly to go off wandering wherever it felt like, following its own inclination, but today I shall control it carefully, like a mahout does a rutting elephant. (この心は、以前には、望むがままに、欲するがままに、快きがままに、さすらっていた。今やわたくしはその心をすっかり抑制しよう、___象使いが鉤をもって、発情期に狂う象を全くおさえつけるように。)

Take pleasure in being careful. Guard your mind well. Extricate yourself from the mire, like a great tusker sunk in the mud. (つとめはげむのを楽しめ。おのれの心を護れ。自己を難処から救い出せ。___泥沼に落ち込んだ象のように。)

If you find an intelligent companion, a wise and well-behaved person going the same way as yourself, then go along with him, overcoming all dangers, pleased at heart and mindful. (もしも思慮深く聡明でまじめな生活をしている人を伴侶として共に歩むことができるならば、あらゆる危険困難に打ち克って、こころ喜び、念いをおちつけて、ともに歩め。)

But if you do not find an intelligent companion, a wise and well-behaved person going the same way as yourself, then go on your way alone, like a king abandoning a conquered kingdom, or like a great elephant in the deep forest. (しかし、もしも思慮深く聡明でまじめな生活をしている人を伴侶として共に歩むことができないならば、国を捨てた国王のように、また林の中の象のように、ひとり歩め。)

It is better to travel alone. There is no companionship with a fool. Go on your way alone and commit no evil, without cares like a great elephant in the deep forest. (愚かな者を道伴れとするな。独りで行くほうがよい。孤独(ひとり)で歩め。悪いことをするな。求めるところは少なくあれ。___林の中にいる象のように。)

It is good to have companions when occasion arises, and it is good to be contented with whatever comes. Merit is good at the close of life, and the elimination of all suffering is good. (事がおこったときに、友だちのあるのは楽しい。(大きかろうとも、小さかろうとも)、どんなことにでも満足するのは楽しい。善いことをしておけば、命の終るときに楽しい。(悪いことをしなかったので)、あらゆる苦しみ(の報い)を除くことは楽しい。)

Good is filial devotion to one’s mother in the world, and devotion to one’s father is good. It is good to be a sanyasi in the world and to be a brahmin too. (世に母を敬うことは楽しい。また父を敬うことは楽しい。世に修行者を敬うことは楽しい。世にバラモンを敬うことは楽しい。)

Good is good behaviour up to old age, good is firmly established faith, good is the acquisition of understanding, and abstention from evil is good. (老いた日に至るまで戒しめをたもつことは楽しい。信仰が確立していることは楽しい。明らかな知慧を体得することは楽しい。もろもろの悪事をなさないことは楽しい。)

English translation by John Richards.
Japanese translation by 中村元 (NAKAMURA Hajime)

Office 2007 Proofing and Input Method Editors (IME)

So I have been toying with the proofing tools and input method editors (IME) from Office 2007. The issue with the single language packs is that you cannot just group the entire stuff together.

Also, trying to run the MSIs from the individual directories for the proofing tools or the IMEs greets you with an ‘Error 1713’. On the other hand, if you run the MSI from the command prompt and pass along LAUNCHEDBYSETUPEXE=1 as an argument, it will install. Curious.

Office 2003, Visual Basic editor and AppLocale

So I was working with a Japanese .xla (Excel add-in) file. I needed to look at something in the source so I fired up the Visual Basic editor within Excel. Upon investigating the form and the various captions it turned out that the Visual Basic editor only displayed them as gibberish (typical decoding issues) or question marks (substituting the .notdef glyph for codepoints). So it seems the Visual Basic editor is either not multi-byte capable (typing a Japanese string directly into the caption yielded question marks) or it is bound to the locale of the system.

I then remembered AppLocale and fired up Excel through it, setting it to think it is on a Japanese system. Then within Excel I proceeded to start the Visual Basic editor and, sure enough, the text was showing me the Japanese I needed.
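Both symptoms can be reproduced outside of Excel. The Python sketch below is only an illustration of the general effect, not of what the Visual Basic editor does internally: the caption text is made up, and cp932 and cp1252 are my assumptions for the code pages of a Japanese and a Western European Windows system.

```python
caption = "保存"  # hypothetical caption text ("save")

# Bytes as they would be stored under the Japanese code page.
raw = caption.encode("cp932")

# Symptom 1: the bytes are interpreted with the wrong code page -> gibberish.
garbled = raw.decode("cp1252", errors="replace")

# Symptom 2: the text is forced into a code page that has no such
# characters, so each one is replaced by a question mark.
question_marks = caption.encode("cp1252", errors="replace").decode("cp1252")

# With the matching code page (roughly what AppLocale arranges) it round-trips.
restored = raw.decode("cp932")

print(garbled)
print(question_marks)
print(restored)
```

The last line prints the original caption again, which matches what I saw once Excel believed it was running on a Japanese system.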

I am not sure if I should find this lame or understandable.

Wah Nam Hong (華南行) in Rotterdam

Here in Rotterdam we have a Chinese supermarket whose name is written, in Dutch-style phonetic Cantonese, as ‘Wah Nam Hong’, which in Jyutping (waa4 naam4 hong4) stands for the hanzi 華南行. Literally translated, 華南 stands for South China, which matches the obvious Cantonese heritage. The 行 stands for a profession or business line.

What is interesting to me is that in Japanese (日本語) you read 華南 as かなん and it means South China as well. However, 行 would be こう or ぎょう and has not retained the profession/business line meaning at all.