Case-insensitive hash-keys in Regexp::Grammars


In the Perl module Regexp::Grammars, consider the following token:

<token: command>       <%commands>

This token is part of a complex grammar, parsing a wide variety of different sentences.

This token matches any word in the hash %commands, which I have defined as follows (of course, outside any function):

our %commands = (
    'Basic_import'  => 1,
    'Wait'          => 1,
    'Reload'        => 1,
    'Log'           => 1,
); 

This works for matching keywords such as "Basic_import" and "Wait". However, I also want it to match words like "basic_import" and "wait".

How do I make this hash case-insensitive without having to copy and paste every keyword multiple times? Because this is part of a complex grammar, I want to keep using Regexp::Grammars, and I'd prefer not to resort to a grep for this particular exception.

There are 3 best solutions below

2
On BEST ANSWER

You can use Hash::Case::Preserve to make hash lookups case insensitive:

use strict;
use warnings 'all';

use Data::Dump;
use Hash::Case::Preserve;
use Regexp::Grammars;

tie my %commands, 'Hash::Case::Preserve';

%commands = (
    'Basic_import'  => 1,
    'Wait'          => 1,
    'Reload'        => 1,
    'Log'           => 1,
);

my $grammar = qr{

    <command>

    <token: command>    <%commands>

};  

dd \%/ if 'basic_import' =~ $grammar;

Output:

{ "" => "basic_import", "command" => "basic_import" }

Note that you have to tie the hash before inserting any values into it.
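
To illustrate why the order matters, here is a minimal sketch of a case-insensitive tied hash built on the core Tie::StdHash class. The package name CaseInsensitiveHash is made up for this example, and the sketch is not meant to mirror how Hash::Case::Preserve is implemented (it simply lower-cases every key and does not remember the original spelling). Keys are folded only when they pass through STORE, so anything placed in the hash before the tie never goes through that step.

```perl
use strict;
use warnings;

package CaseInsensitiveHash;    # hypothetical name, for illustration only
use Tie::Hash;                  # core module that provides Tie::StdHash
our @ISA = ('Tie::StdHash');

# Fold the key to lower case on every access, so 'Wait', 'wait' and 'WAIT'
# all refer to the same entry.
sub STORE  { my ($self, $key, $value) = @_; $self->{ lc $key } = $value }
sub FETCH  { my ($self, $key) = @_; $self->{ lc $key } }
sub EXISTS { my ($self, $key) = @_; exists $self->{ lc $key } }
sub DELETE { my ($self, $key) = @_; delete $self->{ lc $key } }

package main;

# Tie first, then populate: the list assignment goes through STORE.
tie my %commands, 'CaseInsensitiveHash';
%commands = ( Basic_import => 1, Wait => 1, Reload => 1, Log => 1 );

for my $word (qw( wait BASIC_IMPORT Waiting )) {
    printf "%-12s %s\n", $word, exists $commands{$word} ? 'known' : 'unknown';
}
```

Because a lookup such as exists $commands{$word} is routed through EXISTS, any code that checks hash membership sees the hash as case-insensitive without being changed itself.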

2
On

From the documentation, it sounds like <%commands> would match the Wait at the start of Waiting, so even a case-insensitive version of <%commands> would be less than ideal.

You normally want to match a generic identifier, and independently check if the identifier is a valid command. This is what prevents printfoo(); from being equivalent to print foo(); in Perl.

May I suggest the following:

use feature qw( fc );

our %commands = map { fc($_) => 1 } qw(
   Basic_import
   Wait
   Reload
   Log
); 

<rule: command> (<ident>) <require: (?{ $commands{fc($CAPTURE)} })>

<token: ident> \w+

You can probably get away with using lc instead of fc if you need backwards compatibility with versions of Perl older than 5.16.
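
The two-step idea (match a generic identifier, then check it against the command table) can be tried out in plain Perl, without Regexp::Grammars. The following is only a sketch under that assumption: parse_command is a hypothetical helper, and lc is used instead of fc so that it also runs on older Perls.

```perl
use strict;
use warnings;

# Command table with lower-cased keys.
my %commands = map { lc($_) => 1 } qw( Basic_import Wait Reload Log );

# Hypothetical helper: returns the leading identifier of $input if it is a
# known command (ignoring case), and undef otherwise.
sub parse_command {
    my ($input) = @_;

    # Step 1: match a whole identifier, not just a prefix of one.
    my ($ident) = $input =~ /\A\s*(\w+)/;
    return undef unless defined $ident;

    # Step 2: validate the identifier independently of the match.
    return $commands{ lc $ident } ? $ident : undef;
}

for my $input ( 'wait', 'WAIT 5', 'Waiting' ) {
    my $command = parse_command($input);
    printf "%-8s => %s\n", $input, defined $command ? $command : 'no match';
}
```

Since \w+ consumes the entire word before the lookup happens, Waiting is looked up as a whole and rejected, rather than being accepted because it starts with Wait.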

0
On
# Register each keyword twice: once all-lowercase and once as originally spelled.
%commands = map { lc($_) => 1, $_ => 1 } qw(
    Basic_import
    Wait
    Reload
    Log
);
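
A quick check of which spellings this hash covers, written as a standalone sketch (the lexical %commands and the list of test words are chosen for illustration only):

```perl
use strict;
use warnings;

# Same construction as above: every keyword is stored in its original
# spelling and in all-lowercase.
my %commands = map { lc($_) => 1, $_ => 1 } qw(
    Basic_import
    Wait
    Reload
    Log
);

for my $word (qw( Wait wait WAIT )) {
    printf "%-5s %s\n", $word, exists $commands{$word} ? 'known' : 'unknown';
}
```

Only the two stored spellings are keys, so other capitalizations such as WAIT are not found with this approach.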