And if this isn't possible, what is the best practice for dealing with man pages derived from UTF-8-encoded POD?
The first thing to do in order to work with Unicode in POD is to use the directive
=encoding UTF-8
(as discussed here). The pod2text and pod2html tools work fine and produce perfect UTF-8-encoded output. The pod2man tool, however, does not:
pod2man -u MyModule.pm | nroff -Tutf8 -man | less
Neither does perldoc. Non-ASCII characters are all mangled or X-ed out. There is some inconclusive discussion on perlbug on whether this might be a bug in pod2man or *roff.
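For anyone who wants to reproduce this, a minimal test file might look like the sketch below. The file name MyModule.pm, the temporary directory, and the sample text are placeholders chosen for illustration, not taken from the real module.

```shell
# Hypothetical sample file; MyModule.pm stands in for the real module.
workdir=$(mktemp -d)
pod_file="$workdir/MyModule.pm"

# Write a tiny module whose POD declares its encoding and
# contains a few non-ASCII characters.
cat > "$pod_file" <<'EOF'
=encoding UTF-8

=head1 NAME

MyModule - demo with non-ASCII text: café, naïve, Ω

=cut

1;
EOF

# Print the path so it can be fed to pod2man, pod2text, or perldoc.
echo "$pod_file"
```

Feeding the printed path to the pod2man pipeline shown above should be enough to see whether the accented characters survive.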
Since my module deals with Unicode specifically and is intended for distribution on CPAN, Unicode-enabled man pages are a must.
I am using Perl 5.14.2, perldoc 3.15, and *roff 1.21.
All of perldoc, pod2man, and nroff can be made to handle UTF-8 characters correctly. Unfortunately, Perl installers such as Build.PL and the cpan program cannot yet, so unless you do some fiddling by hand during installation, the installed man pages will be broken.

These all work correctly for my minimal example:
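As a rough sketch only: invocations along the following lines match the description in this answer. The helper names are hypothetical, MyModule and MyModule.pm are placeholders, and the sketch assumes a groff-based nroff wrapper that hands everything after -- on to groff. Behaviour may differ between groff versions.

```shell
# Hypothetical helpers; MyModule / MyModule.pm are placeholder names.

# Plain-text view: -t makes perldoc render through pod2text,
# bypassing pod2man and *roff entirely.
#   $1 = module name, e.g. MyModule
view_with_perldoc() {
    perldoc -t "$1"
}

# Man-page view: -u asks pod2man for UTF-8 output; the arguments after
# "--" are assumed to be passed on to groff by the nroff wrapper, so
# "-K utf8" sets groff's input encoding.
#   $1 = file containing the POD, e.g. MyModule.pm
view_with_nroff() {
    pod2man -u "$1" | nroff -Tutf8 -man -- -K utf8
}
```

For example, `view_with_nroff MyModule.pm | less` pages the result.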
nroff only works when you pass the input encoding (-K) through to groff as well (source); you have to protect it with the end-of-options switch (--).

This is nice. However, most users will want to install the documentation and later consult it with man MyModule or perldoc MyModule. In the case of perldoc, your options are to either use a very recent version (3.16) or the -t switch.

In the case of man, if you use Build.PL (Module::Build) to install a module, you can repair the broken generated docs just before the installation:

Lovely! Now you can view the man page with man MyModule.

If you use cpan to install the module, your man pages will be broken. (You can try the same workaround on your local CPAN build directory, e.g. ~/.cpan/build, which should also work.)
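The repair step for Module::Build mentioned above could be sketched roughly as follows. This is an assumption-laden illustration, not necessarily the exact commands used: the helper name regen_manpage is hypothetical, and the paths lib/MyModule.pm and blib/libdoc/MyModule.3pm are guesses at a typical layout (the man page extension depends on the local Perl configuration).

```shell
# Hypothetical helper: overwrite a generated man page with a UTF-8 rendering.
#   $1 = POD source, e.g. lib/MyModule.pm (assumed layout)
#   $2 = page built by ./Build, e.g. blib/libdoc/MyModule.3pm (assumed name)
regen_manpage() {
    pod2man -u "$1" > "$2"
}

# Assumed install sequence, run from the distribution directory:
#   perl Build.PL
#   ./Build
#   regen_manpage lib/MyModule.pm blib/libdoc/MyModule.3pm
#   ./Build install
```

The idea is simply to regenerate the page between the build and install steps so that the installer copies the repaired file. If the generated file is not writable, its permissions would need adjusting first.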