RegexStringComparator BigTable Matching A Binary Number

In BigTable, when using the RegexStringComparator, is it possible to match a number stored in binary format? For example, suppose a Row Key has a number in it, but to save space and to have a predictable length, that number is stored as a 4-byte value rather than as a separate character for each printable digit. Is it then possible to use the RegexStringComparator to match on the number?

Specifically, let's say I want to match on either of two integers A or B, then the regex might look like this...

.*(A|B)

To be more specific, let's say that A=284281344 which is 0x10f1ca00

.*(\\x10\\xf1\\xca\\x00|B)
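
For reference, the four bytes used in the pattern can be checked with a short Python sketch. It assumes the number is stored as an unsigned 32-bit big-endian value, which matches the byte order of 0x10f1ca00 as written here but is not stated explicitly in the question.

```python
import struct

A = 284281344

# Assumption: the row key stores the number as 4 bytes, unsigned, big-endian.
raw = struct.pack(">I", A)

print(hex(A))     # 0x10f1ca00
print(raw.hex())  # 10f1ca00

# Bytes at or above 0x80 are outside the ASCII range.
high_bytes = [b for b in raw if b >= 0x80]
```

Two of the four bytes (0xF1 and 0xCA) fall outside ASCII, so the same concern applies to both.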

I am finding that this does not seem to be possible: high-valued (perhaps non-ASCII) bytes such as 0xF1 do not match.
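
One plausible explanation, offered as an assumption rather than a confirmed diagnosis, is that the regex engine treats `\xf1` as the Unicode code point U+00F1 and looks for its UTF-8 encoding instead of the single raw byte. The Python sketch below only illustrates that encoding mismatch; it does not call Bigtable or any regex engine.

```python
import struct

# Assumption: the key holds the number as 4 big-endian bytes.
raw_key_part = struct.pack(">I", 284281344)  # b"\x10\xf1\xca\x00"

# In UTF-8, the code point U+00F1 is two bytes, not the single byte 0xF1.
utf8_form = "\u00f1".encode("utf-8")
print(utf8_form.hex())  # c3b1

# The two-byte UTF-8 form never appears in the raw key bytes.
found = utf8_form in raw_key_part

# The raw bytes are not valid UTF-8 at all.
try:
    raw_key_part.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
```

If the engine behaves this way, a pattern written with `\xf1` would search for the bytes C3 B1, which a raw 4-byte integer does not contain.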

Any recommendations or thoughts?

There is 1 best solution below

Can you try using the \C escape sequence (matches a single byte even in UTF-8 mode)? Bigtable uses the RE2 regex flavor, as described here: https://github.com/google/re2/wiki/Syntax
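
A minimal sketch of how such a pattern might be built is shown below in Python. The helper `loose_pattern` is hypothetical and not part of any Bigtable or HBase API. It assumes big-endian 4-byte storage and simply replaces every byte at or above 0x80 with `\C`. Because `\C` matches any single byte, the resulting pattern is looser than an exact match, so candidates would still need an exact comparison on the client side. The sketch only builds the pattern string; whether the target service accepts it has not been verified here.

```python
import struct


def loose_pattern(value: int) -> str:
    """Hypothetical helper: RE2-style pattern for a 4-byte big-endian value.

    ASCII-range bytes are written as \\xHH escapes; bytes >= 0x80 are
    replaced with \\C, which matches any single byte.
    """
    raw = struct.pack(">I", value)
    return "".join(r"\C" if b >= 0x80 else f"\\x{b:02x}" for b in raw)


a_pattern = loose_pattern(284281344)
print(a_pattern)  # \x10\C\C\x00

# Combine several values the same way as the .*(A|B) form in the question.
values = [284281344]
full_pattern = ".*(" + "|".join(loose_pattern(v) for v in values) + ")"
```

If the key bytes before the number can also be non-ASCII, the leading `.*` may hit the same limitation, and `\C*` may be worth trying in its place.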