Can I read bigtable records with cbt by time?


I want to read the most recently written records in Bigtable with cbt. However, the docs don't say in what order cbt read returns records, and I don't know what the row keys would be.

Is there a way to read records with cbt by insertion time?

Update:

Here is what I see when reading a table:

cbt read table_name count=10 | grep processedTime
2021/12/17 09:20:42 -creds flag unset, will use gcloud credential
  general:processedTime                    @ 2021/06/29-14:40:04.028000
  general:processedTime                    @ 2021/06/17-12:32:04.055000
  general:processedTime                    @ 2021/06/17-12:32:40.032000
  general:processedTime                    @ 2021/06/17-12:32:43.047000
  general:processedTime                    @ 2021/06/10-18:45:53.495000
  general:processedTime                    @ 2021/06/17-12:31:28.772000
  general:processedTime                    @ 2021/06/17-12:30:41.205000
  general:processedTime                    @ 2021/06/17-12:30:33.960000
  general:processedTime                    @ 2021/06/29-14:40:17.811000
  general:processedTime                    @ 2021/06/17-12:32:06.795000
  general:processedTime                    @ 2021/06/17-12:31:49.202000

The cbt read does not give results in order by time.

Is there a way to get cbt read to order the results by time?

1 Answer

Currently, cbt read does not return rows sorted by time. According to the documentation, Bigtable stores rows sorted lexicographically by row key, and a read returns them in that stored order, so the order of rows in the output says nothing about when they were written. Within a single row and column, however, cells are returned newest first, so the latest version of a cell is always at the top.
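To illustrate why rows can appear out of time order: Bigtable keeps rows sorted lexicographically by row key, which is byte-wise string order. The sketch below imitates that with sort in the C locale. The row keys are hypothetical, and scan_order is just an illustrative helper name, not part of cbt.

```shell
# Imitate Bigtable's row ordering: plain byte-wise (lexicographic) sort.
scan_order() {
  LC_ALL=C sort
}

# Hypothetical row keys, listed in the order they were written.
printf '%s\n' 'event#9' 'event#10' 'event#2' | scan_order
```

The keys come back as event#10, event#2, event#9, which matches neither the write order nor the numeric order. Unless the row key itself encodes time, a scan will not return rows chronologically.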

I ran two tests: in the first I inserted data without specifying a timestamp, and in the second I specified the timestamp explicitly. In both cases, the cells came back in descending timestamp order.

Timestamp not specified:

@cloudshell:~ $ cbt set my-table r1 cf1:c1=val5
2021/12/15 10:51:52 -creds flag unset, will use gcloud credential
@cloudshell:~ $ cbt read my-table
2021/12/15 10:52:07 -creds flag unset, will use gcloud credential
----------------------------------------
r1
  cf1:c1                                   @ 2021/12/15-10:51:59.760000
    "val5"
  cf1:c1                                   @ 2021/12/15-10:26:00.471000
    "val4"
  cf1:c1                                   @ 2021/12/15-10:25:26.863000
    "val3"
  cf1:c1                                   @ 2021/12/15-10:24:58.021000
    "val2"
  cf1:c1                                   @ 2021/12/15-10:24:52.259000
    "val1"
@cloudshell:~ $ cbt read my-table cells-per-column=1
2021/12/15 10:52:17 -creds flag unset, will use gcloud credential
----------------------------------------
r1
  cf1:c1                                   @ 2021/12/15-10:51:59.760000
    "val5"

Timestamp specified:

(Screenshot of the cbt output for this test; image not available.)

Your use case might not be fully covered by the cbt tool. My suggestion is to file a feature request in its GitHub repo, although there is no guarantee of when, or whether, it would be implemented.
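In the meantime, one possible workaround is to sort on the client side. The following is only a sketch, not a cbt feature. It assumes the plain-text layout shown in the output above: a dashed separator line, then the row key, then cell lines indented by two spaces in the form "family:column @ timestamp". The name cells_newest_first is a hypothetical helper. It prints one line per cell (timestamp, row key, column, separated by tabs), newest first.

```shell
# Turn `cbt read` text output into "timestamp<TAB>row key<TAB>column" lines,
# sorted newest first. The YYYY/MM/DD-HH:MM:SS.ffffff timestamps are fixed
# width, so a plain reverse text sort orders them by time.
cells_newest_first() {
  awk '
    /^-+$/     { expect_key = 1; next }            # dashed separator precedes each row key
    expect_key { key = $0; expect_key = 0; next }  # the next line is the row key
    /^  [^ ]/ && $2 == "@" { print $3 "\t" key "\t" $1 }  # cell line: column @ timestamp
  ' | LC_ALL=C sort -r
}

# Usage (assumption: my-table is the table from the tests above):
# cbt read my-table | cells_newest_first | head -n 10
```

Two caveats apply. This still scans the whole table before sorting, so it only suits small tables. It also orders by cell timestamp, which equals the write time only if the writer did not set timestamps explicitly.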