How can I use Unicode characters in reStructuredText (reST)?


I need to use Unicode characters in my Sphinx documentation.

When I insert the line

.. |ohat| unicode:: U+00F4 .. ohat

at the top of a file, I get a "ô" only if the "|ohat|" is surrounded by space.

But if the "|ohat|" is inside a word, it is not translated - I get "D|ohat|le" instead of "Dôle".

How can I use Unicode characters inside a word when using rst?


Answer by mzjn

Without a substitution

Just use Dôle in the document.

With a substitution

The substitution reference must be surrounded by whitespace. To keep that whitespace out of the output, escape each space with a backslash:

D\ |ohat|\ le
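
For context, here is a minimal sketch of a complete source file using this approach. It assumes the same |ohat| definition as in the question; the sentence around the word is only illustrative.

```rst
.. |ohat| unicode:: U+00F4 .. o with circumflex

.. The backslash-escaped spaces delimit the reference but are dropped
   from the output, so the word stays in one piece.

The town of D\ |ohat|\ le is in the Jura.
```

The escaped spaces still count as whitespace for recognizing the substitution reference, which is why the reference is picked up inside the word.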


Answer by G. Milde

With the :trim: option of the "unicode" directive, you can define a substitution that removes surrounding whitespace:

.. |ohat| unicode:: U+00F4
   :trim:

D |ohat| le

prints "Dôle".

To get whitespace around the replacement character, put escaped spaces around the substitution: D \ |ohat|\ le. There are also the :ltrim: and :rtrim: options for one-sided trimming.
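
As a rough sketch of one-sided trimming: the substitution names |deg| and |euro| below are made up for illustration and are not part of the question. :ltrim: removes whitespace to the left of the reference, :rtrim: removes whitespace to the right.

```rst
.. |deg| unicode:: U+00B0 .. degree sign
   :ltrim:

.. |euro| unicode:: U+20AC .. euro sign
   :rtrim:

.. |deg| attaches to the number before it,
   |euro| attaches to the number after it.

It was 20 |deg| outside and the ticket cost |euro| 15.
```

This should attach the degree sign to "20" and the euro sign to "15", while leaving the space on the untrimmed side of each character in place.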

There are sets of ready-made substitution definitions in the reStructuredText Standard Definition Files. This is convenient if you need several substitutions. For example, after

.. include:: <isolat1.txt>

you can write D\ |ocirc|\ le et no\ |euml|\ l.

(These definitions do not use the "trim" option, so you need the escaped whitespace for occurrences inside a word.)