cbt deleterow for row keys containing spaces


I have hundreds of thousands of row keys that were inserted by mistake, and I need a way to delete them.

I was testing cbt deleterow at the Linux prompt, trying to find the right expression to make it accept a row key containing spaces, and getting nothing but 'no-go' results. While reading articles, I found the one below.

According to Google's cbt reference, https://cloud.google.com/bigtable/docs/cbt-reference, "row-key: String or raw bytes. Raw bytes must be enclosed in single quotes and have a dollar-"

I tried this to see whether a space can get through under this rule. The cbt deleterow loop I am using looks something like this:

for x in $(cbt -project my-project -instance my-instace read my-table prefix=abc cells-per-column="1" columns=cf.firstattr count=3 | grep "abc"); do
  cbt -project my-project -instance my-instace deleterow my-table $'$x'
  echo $x' deleted'
done

(Kudos to those who answered the post "How to use cbt to delete range of rows with a prefix key from BigTable"; I am posting this because it did not fit my case, as my row keys contain blank characters.)

Strangely, it does not delete the rows, although $x appears to be formatted correctly:

$'ALL OF US#1234567890#'

Running the loop throws the error message 'bad parameter "OF"'.

When I run the command myself at the prompt:

cbt -project my-project -instance my-instace deleterow my-table $'ALL OF US#1234567890#'

it finds the row key fine.

It seems that when the line is executed inside the loop, the single quotes are stripped and only the first part of the row key (i.e. 'ALL') becomes the row key, while 'OF' is left over as an unexpected parameter.
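
The error is consistent with shell word splitting: an unquoted expansion, or a for loop over command output, breaks a key at every space, so 'OF' reaches cbt as a separate parameter. Note also that in bash, $'...' does not expand variables, so $'$x' is the literal text $x rather than the key. A minimal sketch of the splitting in plain shell; count_args and key are illustrative names, not part of cbt:

```shell
# Print how many arguments the shell actually passes to a command.
count_args() { echo "$#"; }

key='ALL OF US#1234567890#'

# Unquoted expansion: the shell splits the key at each space.
unquoted=$(count_args $key)

# Double-quoted expansion: the key stays a single argument.
quoted=$(count_args "$key")

echo "unquoted=$unquoted quoted=$quoted"
```

Here the unquoted form yields three arguments and the quoted form yields one, which is why the key should reach cbt inside double quotes.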

Can anyone see what I am missing here?

Thanks in advance.

I want to iterate over the row keys and have 'cbt deleterow' succeed for each of them.


There is 1 solution below


I'm back to share what I found.

I was trying ugly ways to make it run, such as piping the commands into a comand.sh file, running chmod 755 on it and invoking sh ./comand.sh, and even so it was a no-go. I was confusing the context of the command prompt with that of a shell script run under a shebang; there are differences (suggestion: google 'difference between executing sh and shell script').

I doubted the regex I had written, since I had not succeeded yet, so I kept searching, reformulating the question, and reading.

In short, I arrived at this way to make it work, for lookup or deleterow of row keys containing spaces:

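# List the row keys, then build one cbt command per key and hand it to bash.
# The echo argument is double-quoted so that $line expands and bash receives $'<key>'.
cbt -project aproject -instance aninstance read agiventable regex='^[0-9].20240301.$' cells-per-column="1" prefix=ROYAUME columns=CF:id_produit count=5 \
  | grep '^ROYAUME' \
  | while IFS= read -r line; do
      echo "cbt -project aproject -instance aninstance lookup agiventable \$'$line' cells-per-column=1 columns=CF:id_produit" | bash -
    done \
  | grep '^ROYAUME' > result_test_ROYAUME.txt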

cat result_test_ROYAUME.txt

ROYAUME DU MONROI20240301#00000000000000000003#

ROYAUME DU MONROI20240301#00000000000000000004#

ROYAUME DU MONROI20240301#00000000000000000005#

ROYAUME DU MONROI20240301#00000000000000000006#

ROYAUME DU MONROI20240301#00000000000000000007#

First, I use cbt read with a regex and a prefix to produce the list of row keys I want. Then I pipe that list into a while loop, where I build the cbt command with the iteration variable wrapped in $' and ' to indicate that the row key in the variable contains raw bytes (i.e. spaces).
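
A simpler variant of the same loop, offered as a sketch rather than a tested recipe: since $'ALL OF US' and "ALL OF US" reach cbt as the same single argument, the key can be passed directly in double quotes, without echo and bash. The helper name delete_rows_from_stdin and the variables CBT_PROJECT and CBT_INSTANCE are assumptions made for this example, not cbt features.

```shell
# Hypothetical helper: reads one row key per line from stdin and deletes each row.
# CBT_PROJECT and CBT_INSTANCE are assumed to be set by the caller.
delete_rows_from_stdin() {
  table=$1
  # IFS= keeps leading and trailing blanks; -r keeps backslashes literal.
  while IFS= read -r key; do
    [ -n "$key" ] || continue   # skip empty lines
    # Double quotes pass the whole key, spaces included, as one argument.
    # stdin is redirected so the command cannot consume the remaining keys.
    cbt -project "$CBT_PROJECT" -instance "$CBT_INSTANCE" \
      deleterow "$table" "$key" </dev/null
  done
}

# Example use, with the same filter as above (try lookup before deleterow):
#   cbt -project aproject -instance aninstance read agiventable prefix=ROYAUME \
#     | grep '^ROYAUME' | delete_rows_from_stdin agiventable
```

Because the key is passed with ordinary double quotes, this does not depend on $'...', which is a bash feature and may not be available when a script is run with sh.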

In my opinion, if it can serve as advice: avoid letting spaces into row keys in Bigtable, and if there are articles on best practices for distributing row keys across Bigtable, read them. (The example I gave, with a sequential number in the row key, leans towards a bad example; instead of 00000000000000000003, reverse it to 30000000000000000000. Since the lower digits vary more frequently, the keys will be more evenly distributed.)
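
To illustrate the reversal described above, here is a small sketch in plain shell. The function name reverse_string is made up for this example; how the reversed number is then combined with the rest of the row key is left open.

```shell
# Reverse a string one character at a time, using only POSIX parameter expansion.
# The body runs in a subshell so its variables do not leak into the caller.
reverse_string() (
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    first=${s%"$rest"}   # the first character
    out=$first$out       # prepend it to the result
    s=$rest
  done
  printf '%s\n' "$out"
)

reverse_string 00000000000000000003
```

Consecutive sequence numbers then differ in their leading digit instead of their last one, so they no longer sort next to each other.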

I was trying to pipe the cbt commands into a comand.sh file, run chmod 755 on it and execute it with sh, and I ran into another issue: the difference in environment context between sh and the command line. That is a good battle, but not for now, haha. What saved me was the ' | bash - ' after 'echo cbt ....': it made the expected results come out, so I was able to move forward.

I could not write this up in a fancy way, but I hope it helps those who run into the same problem I faced.