Category Archives: Email

Character encoding in mailcap for mutt and w3m

I use mutt on my FreeBSD system to read my mail. To read HTML mail I simply use a .mailcap file with an entry such as

text/html;      w3m -dump %s; nametemplate=%s.html; copiousoutput

This in effect uses w3m to dump the HTML as plain text so that it can be displayed safely. The problem I had is that some of the emails I receive come from a Japanese translators’ list and are in Shift_JIS. When dumping, w3m doesn’t properly detect the Shift_JIS encoding, and the resulting output is garbled.

When I looked at the attachments in the mail with mutt’s ‘v’ command I saw that mutt at least knows the encoding of the attachment, so I figured there should be a way of using this information in my mailcap. There is indeed a way to do so, namely the charset parameter. The mailcap format is described in an RFC, RFC 1524 to be exact. Mutt furthermore makes the parameters of the Content-Type header available as mailcap variables. So a Content-Type: text/html; charset=shift_jis means that %{charset} in the mailcap file will be expanded to shift_jis. We can use this with w3m’s -I flag to set the proper input encoding prior to dumping.

text/html;      w3m -I %{charset} -dump %s; nametemplate=%s.html; copiousoutput

This way you can be relatively sure that the dumped text is decoded with the appropriate encoding. Of course it depends on a properly set Content-Type header, but if you cannot rely on that one, you already need to dig out the recovery tools.
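To make the expansion step concrete, here is a minimal Python sketch of how a mailcap command template could be filled in from a Content-Type header. It is not mutt’s actual code: the function name expand_mailcap_command and the file path are made up for illustration, and only %s, %t and %{parameter} are handled.

```python
import re
import shlex
from email.message import Message


def expand_mailcap_command(template, content_type, filename):
    """Expand %s, %t and %{param} in a mailcap command template.

    Hypothetical helper for illustration only; real mail clients
    handle more escapes and edge cases than this sketch does.
    """
    # Let the standard library parse the Content-Type header for us.
    msg = Message()
    msg["Content-Type"] = content_type

    def replace(match):
        token = match.group(0)
        if token == "%s":
            # %s is the file holding the message body.
            return shlex.quote(filename)
        if token == "%t":
            # %t is the bare type/subtype, without parameters.
            return msg.get_content_type()
        # %{name} is a named Content-Type parameter such as charset.
        value = msg.get_param(match.group(1), failobj="")
        # Quote the value, since it comes from an untrusted email header.
        return shlex.quote(str(value))

    return re.sub(r"%s|%t|%\{([^}]+)\}", replace, template)


command = expand_mailcap_command(
    "w3m -I %{charset} -dump %s",
    "text/html; charset=shift_jis",
    "/tmp/mail.html",  # assumed temporary file name
)
print(command)
```

The values are passed through shlex.quote because they originate from the message headers; a charset value containing shell metacharacters would otherwise end up verbatim on the command line.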

Email threading and breaking it

One thing that has been annoying me over the past years is that on mailing lists people with Outlook or Outlook Express clients tend to start a new thread by replying to an email in another thread. They remove the body, perhaps some cc: information, change the subject, write their own text and send it off. Since Outlook supports completion for the to: and cc: fields, starting a new topic like that seems a bit of a time-waster compared to simply composing a new message.

But leaving all that aside, the worst part is that Outlook doesn’t show you all headers. In particular, the References header is left intact, which means the message is still a reply to an earlier post in the other thread. And so threading is broken. You might ask why anyone would be annoyed by this. The problem is that online mailing list archives use this information for navigating through a thread. Typically they provide ‘next by thread’ and ‘previous by thread’ hyperlinks, but the logical flow of the thread is now broken.
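The effect can be shown with a small Python sketch of header-based threading. It is a simplified assumption of how an archive might group messages, not the algorithm of any particular archive software; the helper names and the example.org message IDs are invented for the example.

```python
from email.message import Message


def make_message(msg_id, subject, references=None):
    """Build a bare message; references lists the ancestor Message-IDs."""
    msg = Message()
    msg["Message-ID"] = msg_id
    msg["Subject"] = subject
    if references:
        msg["References"] = " ".join(references)
        msg["In-Reply-To"] = references[-1]
    return msg


def parent_id(msg):
    """Return the Message-ID this message claims to reply to, if any."""
    refs = (msg.get("References") or "").split()
    if refs:
        # The last entry in References is the direct parent.
        return refs[-1]
    return (msg.get("In-Reply-To") or "").strip() or None


def thread_root(msg, by_id):
    """Follow parent links until no known parent remains."""
    seen = set()
    current = msg
    while True:
        pid = parent_id(current)
        if pid is None or pid not in by_id or pid in seen:
            return current["Message-ID"]
        seen.add(pid)
        current = by_id[pid]


messages = [
    make_message("<1@example.org>", "Original topic"),
    make_message("<2@example.org>", "Re: Original topic", ["<1@example.org>"]),
    # A "new" topic created by replying and only changing the subject:
    make_message("<3@example.org>", "Completely new question",
                 ["<1@example.org>", "<2@example.org>"]),
    # The same topic started as a genuinely new message:
    make_message("<4@example.org>", "Completely new question"),
]
by_id = {m["Message-ID"]: m for m in messages}

for m in messages:
    print(m["Subject"], "->", thread_root(m, by_id))
```

The third message carries a fresh subject, but its headers still attach it to the thread rooted at the first message. Only the fourth one, composed from scratch, starts a thread of its own.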

So whenever I see someone with Outlook do this, I send them a note about this in private and most often they adjust the way they work, since, like I stated earlier, starting a new message is actually even faster.

Lightning 0.8 released

For those of you who use Thunderbird and want a calendaring option inside it that communicates properly with people using Outlook or Lotus Notes who send you invitations for meetings and the like: Mozilla’s Lightning is now at version 0.8. Lightning is an add-on for Thunderbird based on Sunbird.

If you then also use the Provider for Google Calendar you can synchronise your Google Calendar with your Lightning setup.