And if this isn't possible, what is the best practice for dealing with man pages derived from UTF-8-encoded POD?
The first thing to do in order to work with Unicode in POD is to use the directive
=encoding UTF-8
(as discussed here). The pod2text and pod2html tools will work fine and produce perfect UTF-8-encoded output.
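A minimal sketch of a file carrying that directive, assuming a hypothetical module called MyModule saved as MyModule.pm in the current directory (the description text is only an illustration):

```shell
# Write a minimal module whose POD declares its encoding.
# MyModule and its description are hypothetical placeholders.
cat > MyModule.pm <<'EOF'
package MyModule;
1;

__END__

=encoding UTF-8

=head1 NAME

MyModule - handles text such as "naïve café"

=cut
EOF

# Confirm the directive is present before running any pod2* tool.
grep -n '^=encoding' MyModule.pm
```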
The pod2man tool, however, does not:
pod2man -u MyModule.pm | nroff -Tutf8 -man | less
Neither does perldoc. Non-ASCII characters are all mangled or X-ed out. There is some inconclusive discussion on perlbug on whether this might be a bug in pod2man or *roff.
Since my module deals with Unicode specifically and is intended for distribution on CPAN, Unicode-enabled man pages are a must.
I am using Perl 5.14.2, perldoc 3.15, and *roff 1.21.
All of perldoc, pod2man, and nroff can be made to handle UTF-8-encoded Unicode characters correctly. Unfortunately, the Perl installers such as Build.PL and the cpan program can't yet. So unless you do some fiddling by hand during the installation, the installed man pages will be broken.
These all work correctly for my minimal example:
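As a sketch of such invocations (an assumption, not necessarily the exact commands used), the two helpers below wrap the flags named in this answer. The function names are hypothetical, and the `-K utf8` after `--` relies on the nroff wrapper handing the remaining arguments on to groff:

```shell
# Hypothetical helper names; the flags are the ones named in the text.

# perldoc with plain-text output (-t); takes a module name or path.
show_perldoc() {
    perldoc -t "$1"
}

# pod2man with UTF-8 output (-u), rendered by nroff. Everything after
# "--" is passed on to groff, so -K utf8 sets the input encoding there.
show_man() {
    pod2man -u "$1" | nroff -Tutf8 -man -- -K utf8
}
```

For example, `show_man MyModule.pm | less` pages the rendered result.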
nroff only works when you pass the input encoding (-K) through to groff as well (source); you have to protect it with the end-of-options switch (--).
This is nice. However, most users will want to install the documentation and later consult it with man MyModule or perldoc MyModule.
In the case of perldoc, your options are to either use a very recent version (3.16) or the -t switch.
In the case of man, if you use Build.PL (Module::Build) to install a module, you can repair the broken generated docs just before the installation:
Lovely! Now you can view the man page with man MyModule.
If you use cpan to install the module, your man pages will be broken. (You can try the same workaround on your local CPAN build directory, e.g. ~/.cpan/build, which should also work.)
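The Module::Build repair step mentioned above could be sketched roughly as follows. The helper name fix_manpage and the paths lib/MyModule.pm and blib/libdoc/MyModule.3pm are assumptions about a typical distribution layout (the man page suffix varies by platform), so adjust them to what ./Build actually generates:

```shell
# Hypothetical helper: regenerate one man page with UTF-8 output and
# replace the copy that ./Build placed under blib/.
fix_manpage() {
    src=$1    # POD source, e.g. lib/MyModule.pm (assumed path)
    dest=$2   # generated page, e.g. blib/libdoc/MyModule.3pm (assumed path)
    # Write to a temporary file first so a failed run does not leave a
    # truncated page, and so a read-only target can still be replaced.
    pod2man -u "$src" > "$dest.tmp" && mv -f "$dest.tmp" "$dest"
}

# Intended order of steps (commented out so nothing is installed by accident):
#   perl Build.PL
#   ./Build
#   fix_manpage lib/MyModule.pm blib/libdoc/MyModule.3pm
#   ./Build install
```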