What encoding does Outlook use for plain text messages?


I need to decode e-mails saved from Outlook as Text Only. Unfortunately they're not in plain ISO-8859-1 since they contain special "smart quote" characters. Does the codepage used by Outlook have a real name (that I can pass to unicode.decode() in Python) or is it just some arbitrary made-up nonsense which I'll have to manually decode? And if so, does anyone have a reference for all the "special" characters Microsoft added?


There are 2 best solutions below

1
On BEST ANSWER

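It's quite likely that Outlook saves messages in the ANSI code page of your current Windows locale. On a Western European or US system my guess would be Windows-1252, which Python accepts under the codec name "cp1252" (alias "windows-1252").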
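A minimal sketch of what that looks like in Python 3, assuming the file really is Windows-1252. The sample bytes and the helper name `cp1252_extras` are made up for illustration; only `bytes.decode` and the "cp1252" codec come from the standard library. The helper also lists the characters Windows-1252 places in the 0x80-0x9F range, which is where it differs from ISO-8859-1.

```python
# Hypothetical sample of bytes as they might appear in a message saved as
# "Text Only". 0x93/0x94 are curly double quotes, 0x92 a curly apostrophe
# and 0x96 an en dash in Windows-1252; ISO-8859-1 has only control
# characters at those byte values.
raw = b"He said \x93hello\x94 \x96 it\x92s fine"

text = raw.decode("cp1252")  # "windows-1252" is an accepted alias

# ascii() escapes the non-ASCII characters so the output is safe on any console.
print(ascii(text))


def cp1252_extras():
    """Return {byte value: character} for the 0x80-0x9F range of cp1252."""
    extras = {}
    for byte in range(0x80, 0xA0):
        try:
            extras[byte] = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            # A few byte values in this range are unassigned in Python's
            # cp1252 codec and cannot be decoded.
            pass
    return extras


extras = cp1252_extras()
```

If the decoded text shows sensible quotes and dashes, the guess was probably right; if accented letters come out wrong, a different code page is likely in play.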

Nitpick: What you call “smart quotes” is actually the way quotes are supposed to look. The quotes you've been using in your post are known as “typewriter quotes”. On mechanical typewriters the number of keys was a major cost factor, so the opening and closing quotes, which look very similar to one another, and the inch symbol were coalesced into a single key, aesthetics be damned.

1
On

There are many locale-dependent Windows code pages, so in the worst case the encoding depends on the country in which the sender resides.
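To illustrate why the code page matters, here is a small sketch that decodes the same bytes under several Windows code pages. The function name `decode_candidates` and the choice of candidates are assumptions for the example, not anything Outlook prescribes. Note that single-byte code pages rarely raise errors, so a successful decode does not prove the guess was right; the result still has to be checked by eye.

```python
import locale


def decode_candidates(raw, codepages=("cp1252", "cp1251", "cp1253")):
    """Decode raw bytes under each candidate code page (hypothetical helper).

    Returns {codec name: decoded text, or None if the bytes are invalid
    in that code page}.
    """
    results = {}
    for name in codepages:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# The byte 0xE9 is a different letter in the Western European (cp1252),
# Cyrillic (cp1251) and Greek (cp1253) code pages.
sample = b"caf\xe9"
decoded = decode_candidates(sample)
for name, value in decoded.items():
    print(name, ascii(value))

# The encoding Python considers the default on this machine; it is only a
# reasonable first guess if the message was saved on a similarly configured PC.
local_guess = locale.getpreferredencoding(False)
```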