Is this a bug, or am I doing something wrong? I'm trying to match Russian swear words in a multiplayer game chat log on CentOS 6.5 with the stock Perl 5.10.1:
# echo блядь | perl -ne 'print if /\bбля/'
# echo блядь | perl -ne 'print if /бля/'
блядь
# echo $LANG
en_US.UTF-8
Why doesn't the first command print anything?
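You have to tell Perl that the source code contains UTF-8 (use utf8), and that the input (-CI) and output (-CO) are UTF-8 encoded. Otherwise Perl works on raw bytes, and the bytes that make up a UTF-8 encoded Cyrillic letter are not word characters, so \b finds no word boundary in front of them.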