I have come across a peculiarity in a plperl stored procedure on Postgres 9.2 with Perl 5.12.4.
The curious behavior can be reproduced using this "broken" SP:
CREATE FUNCTION foo(VARCHAR) RETURNS VARCHAR AS $$
my ( $re ) = @_;
$re = ''.qr/\b($re)\b/i;
return $re;
$$ LANGUAGE plperl;
When executed:
# select foo('foo');
ERROR: Unable to load utf8.pm into plperl at line 3.
BEGIN failed--compilation aborted.
CONTEXT: PL/Perl function "foo"
However, if I move the qr// operation into an eval, it works:
CREATE OR REPLACE FUNCTION bar(VARCHAR) RETURNS VARCHAR AS $$
my ( $re ) = @_;
eval "\$re = ''.qr/\\b($re)\\b/i;";
return $re;
$$ LANGUAGE plperl;
Result:
# select bar('foo');
bar
-----------------
(?^i:\b(foo)\b)
(1 row)
Why does the eval bypass the automatic use utf8? Why is use utf8 even required in the first place? My code is not in UTF8, which is said to be the only time one should use utf8.
If anything, I might expect the eval version to break without use utf8, in the case where the input to the script contained non-ASCII values. (Further testing shows that passing non-ASCII values to bar() does indeed cause the eval to fail with the same error.)
Note that many Postgres installations automatically load 'utf8' on startup of the perl interpreter. This is the default in Debian at least, as demonstrated by executing:
DO 'elog(WARNING, join ", ", sort keys %INC)' language plperl;
WARNING: Carp.pm, Carp/Heavy.pm, Exporter.pm, feature.pm, overload.pm, strict.pm, unicore/Heavy.pl, unicore/To/Fold.pl, unicore/lib/Perl/SpacePer.pl, utf8.pm, utf8_heavy.pl, vars.pm, warnings.pm, warnings/register.pm
CONTEXT: PL/Perl anonymous code block
DO
But not so on the machine demonstrating the odd behavior:
WARNING: Carp.pm, Carp/Heavy.pm, Exporter.pm, feature.pm, overload.pm, overloading.pm, strict.pm, vars.pm, warnings.pm, warnings/register.pm
CONTEXT: PL/Perl anonymous code block
DO
This question is not about how to get my target machine to load utf8 automatically; I know how to do that. I'm curious why it seems to be necessary in the first place.
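In the version that's failing, you're executing qr/\b($re)\b/i, where the pattern is interpolated from $re at run time.
In the version that's succeeding, you're executing qr/\b(foo)\b/i, because the double-quoted string has already interpolated $re before eval compiles it, so the pattern is a literal.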
Sounds like qr// needs utf8.pm when the pattern was compiled as a Unicode pattern (whatever that means), but the latter isn't compiled as a Unicode pattern.
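To make the difference between the two versions concrete, here is a minimal sketch in plain Perl (outside plperl, so it does not reproduce the error itself). The input value 'foo' and the variable name $source are assumptions made for illustration. It prints the source text that bar() actually hands to eval:

```perl
use strict;
use warnings;

my $re = 'foo';    # hypothetical input, standing in for the function argument

# bar(): the double-quoted string interpolates $re first, so eval
# receives source code that contains a literal pattern.
my $source = "\$re = ''.qr/\\b($re)\\b/i;";
print "$source\n";    # prints: $re = ''.qr/\b(foo)\b/i;

# Compile and run that source; $re now holds the stringified regex.
eval $source;
die $@ if $@;
print "$re\n";
```

In foo(), by contrast, the qr// operator itself interpolates $re when the function runs, so the regex engine sees whatever string (and string encoding) the argument arrived with.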
The failure to load utf8.pm is due to the limitations imposed by the Safe compartment created by plperl.
The fix is to load the module outside the Safe compartment.
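One way this might be done, assuming you can edit postgresql.conf and restart the server, is the plperl.on_init setting, which runs Perl code when the interpreter is first initialized, before it is specialized for plperl. This is only a sketch: whether loading utf8 alone is enough on a given Perl build is an assumption, and additional modules may need to be preloaded the same way.

```
# postgresql.conf
# Load utf8.pm before the Safe compartment is created (assumed sufficient here).
plperl.on_init = 'use utf8;'
```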
The workaround is to use the more efficient