HBase record limit in Scan API

Is there a Java API to limit the number of records a scan returns after setting start and stop rows? Is PageFilter an option?

There are 4 best solutions below

Did you try setMaxResultSize()? Note that it caps the result size in bytes, not the number of rows.

PageFilter may not give the expected results. The documentation says:

this filter cannot guarantee that the number of results returned to a client are <= page size. This is because the filter is applied separately on different region servers. It does however optimize the scan of individual HRegions by making sure that the page size is never exceeded locally.
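To make the quoted caveat concrete, here is a minimal sketch in plain Java. It is a toy model, not HBase code: the class `PageFilterModel` and its methods are hypothetical, and each "region" is just a list that applies the page size on its own. It shows why the merged result can exceed the page size and why the client still has to truncate.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: regions are plain lists; no HBase classes are used.
class PageFilterModel {

    // Each "region" applies the page size on its own, as the quoted doc describes.
    static List<String> scanAllRegions(List<List<String>> regions, int pageSize) {
        List<String> results = new ArrayList<>();
        for (List<String> region : regions) {
            results.addAll(region.subList(0, Math.min(pageSize, region.size())));
        }
        return results;
    }

    // The client enforces the real limit by truncating the merged results.
    static List<String> clientSideLimit(List<String> rows, int limit) {
        return new ArrayList<>(rows.subList(0, Math.min(limit, rows.size())));
    }

    public static void main(String[] args) {
        List<List<String>> regions = List.of(
                List.of("a1", "a2", "a3"),
                List.of("b1", "b2", "b3"));
        List<String> merged = scanAllRegions(regions, 2);
        System.out.println(merged);                     // [a1, a2, b1, b2]
        System.out.println(clientSideLimit(merged, 2)); // [a1, a2]
    }
}
```

With two regions and a page size of 2, four rows come back, so a client that relies on PageFilter alone would also need to count rows itself and stop once it has enough.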

Use the scan.setLimit(int) method:

https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setLimit-int-

Set the limit of rows for this scan. We will terminate the scan if the number of returned rows reaches this value. This condition will be tested at last, after all other conditions such as stopRow, filter, etc.
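The ordering described in that quote (range first, then filter, then the row limit last) can be sketched in plain Java. This is only a toy model of the semantics, assuming sorted string row keys; `ScanLimitModel` and `scan` are hypothetical names and no HBase classes are involved.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Toy model only: row keys are sorted strings; no HBase classes are used.
class ScanLimitModel {

    // Range check first, then the filter, and the row limit last.
    static List<String> scan(List<String> sortedRows, String startRow, String stopRow,
                             Predicate<String> filter, int limit) {
        return sortedRows.stream()
                .filter(r -> r.compareTo(startRow) >= 0 && r.compareTo(stopRow) < 0)
                .filter(filter)
                .limit(limit)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> rows = List.of("row1", "row2", "row3", "row4", "row5", "row6");
        // Keep even-numbered rows in [row2, row6), at most 2 of them.
        System.out.println(scan(rows, "row2", "row6",
                r -> (r.charAt(3) - '0') % 2 == 0, 2)); // [row2, row4]
    }
}
```

Against a real table, and assuming a client version that has setLimit, the equivalent call chain would look like `new Scan().withStartRow(startRow).withStopRow(stopRow).setLimit(2)`.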

http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setCaching(int) might help. setCaching() defines how many rows HBase returns per RPC call, so it tunes batching rather than capping the total number of rows.
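As a rough illustration of what "results per RPC call" means, the sketch below models each RPC as one batch of rows. It is plain Java with hypothetical names (`CachingModel`, `rpcBatches`), not HBase code, and it assumes a positive caching value.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: an "RPC" is one batch of rows; no HBase classes are used.
class CachingModel {

    // Split the scan result into batches of at most `caching` rows, one per RPC.
    static List<List<String>> rpcBatches(List<String> rows, int caching) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += caching) {
            batches.add(rows.subList(i, Math.min(i + caching, rows.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("r1", "r2", "r3", "r4", "r5");
        List<List<String>> batches = rpcBatches(rows, 2);
        // 3 RPCs: [[r1, r2], [r3, r4], [r5]]
        System.out.println(batches.size() + " RPCs: " + batches);
    }
}
```

All five rows still arrive; only the number of round trips changes. To stop after N rows you would combine this with a row limit or break out of the result loop yourself.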

This answer applies only if you want to get a single row.

If you're using an older version of HBase where setLimit is not available, you can use stopRow instead: give it the same value as startRow plus a trailing zero byte, which makes the stop row inclusive. From the documentation:

Note: In order to make stopRow inclusive add a trailing 0 byte

Here is an example:

    import org.apache.hadoop.hbase.client.Scan;

    byte[] startRow = new byte[] { (byte) 0xab, (byte) 0xac };
    // stopRow is startRow followed by a single 0x00 byte
    byte[] stopRow = new byte[startRow.length + 1];
    System.arraycopy(startRow, 0, stopRow, 0, startRow.length);
    stopRow[stopRow.length - 1] = 0; // inclusive
    Scan scan = new Scan().setStartRow(startRow).setStopRow(stopRow);
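Why the trailing zero byte works: in unsigned lexicographic byte order, startRow followed by 0x00 is the smallest possible key that sorts after startRow, so the half-open range [startRow, stopRow) can contain only startRow itself. The sketch below checks this in plain Java. It is a toy model in which a TreeMap stands in for the sorted table; `InclusiveStopRow` and its methods are hypothetical names, and `Arrays.compareUnsigned` needs Java 9 or later.

```java
import java.util.Arrays;
import java.util.TreeMap;

// Toy model only: a TreeMap stands in for a table sorted by unsigned row-key bytes.
class InclusiveStopRow {

    // Copy the key and append one 0x00 byte (Arrays.copyOf zero-pads the extra slot).
    static byte[] withTrailingZero(byte[] row) {
        return Arrays.copyOf(row, row.length + 1);
    }

    // Count keys in [startRow, stopRow), the half-open range a scan covers.
    static int rowsInRange(TreeMap<byte[], String> table, byte[] startRow, byte[] stopRow) {
        return table.subMap(startRow, true, stopRow, false).size();
    }

    public static void main(String[] args) {
        TreeMap<byte[], String> table = new TreeMap<byte[], String>(Arrays::compareUnsigned);
        table.put(new byte[] { (byte) 0xab, (byte) 0xac }, "target");
        table.put(new byte[] { (byte) 0xab, (byte) 0xac, 0x01 }, "next");
        table.put(new byte[] { (byte) 0xab, (byte) 0xad }, "later");

        byte[] startRow = { (byte) 0xab, (byte) 0xac };
        // Only the "target" row falls inside the range.
        System.out.println(rowsInRange(table, startRow, withTrailingZero(startRow))); // 1
    }
}
```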