I have an HBase table, and I need to fetch results from several row ranges, for example rows 1-6, 100-150, and so on. I know that each scan lets me define a start row and a stop row, but with 6 ranges I would need to run 6 scans. Is there any way to get the results from multiple ranges with a single scan or a single RPC? My HBase version is 0.98.
MultiRowRangeFilter is a filter that supports scanning multiple row key ranges. It can construct the row key ranges from the passed list, which can be accessed by each region server.
HBase is quite efficient when scanning only one small row key range. If the user needs to specify multiple row key ranges in one scan, the typical solutions are:
- through a FilterList, which is a list of row key filters;
- using a SQL layer over HBase, such as Hive or Phoenix, to join two tables.
However, both solutions are inefficient.
Neither of them can use the range information to fast-forward during the scan, which makes the scan quite time consuming. If the number of ranges is very large (e.g. millions), a join is a proper solution, though it is slow.
However, there are cases where the user wants to scan only a small number of ranges (e.g. <1000 ranges), and neither solution provides satisfactory performance in such a case. MultiRowRangeFilter supports this use case: it constructs the row key ranges from a user-specified list and fast-forwards during the scan, so the scan is quite efficient.
Please see this test case for an example implementation in Java.
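To make the fast-forwarding idea concrete, here is a minimal stand-alone sketch in plain Java. It does not use the HBase API: the class `RowRangeSketch`, its `Range` type, and the methods `sortAndMerge` and `nextRowToRead` are hypothetical names made up for illustration, and row keys are simplified to non-negative integers (real HBase row keys are byte arrays compared lexicographically). It shows the general technique of sorting and merging the requested ranges once, then using a binary search to decide, for any row, whether to read it or jump straight to the start of the next range.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Stand-alone illustration only; none of these names belong to the HBase API.
final class RowRangeSketch {

    /** Half-open range [start, stop) over non-negative integer row keys (a simplification). */
    static final class Range {
        final int start;
        final int stop;

        Range(int start, int stop) {
            this.start = start;
            this.stop = stop;
        }
    }

    /** Sorts the ranges by start key and merges any that overlap or touch. */
    static List<Range> sortAndMerge(List<Range> input) {
        List<Range> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparingInt((Range r) -> r.start));
        List<Range> merged = new ArrayList<>();
        for (Range r : sorted) {
            if (!merged.isEmpty() && r.start <= merged.get(merged.size() - 1).stop) {
                // Overlaps or touches the previous range: extend it.
                Range last = merged.remove(merged.size() - 1);
                merged.add(new Range(last.start, Math.max(last.stop, r.stop)));
            } else {
                merged.add(r);
            }
        }
        return merged;
    }

    /**
     * Returns the row itself if it lies inside a range, the start of the next
     * range if it lies in a gap (the seek hint), or -1 if it is past all ranges.
     * Expects the output of sortAndMerge.
     */
    static int nextRowToRead(List<Range> merged, int row) {
        int lo = 0;
        int hi = merged.size() - 1;
        int candidate = -1; // index of the first range whose stop is greater than row
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (merged.get(mid).stop > row) {
                candidate = mid;
                hi = mid - 1;
            } else {
                lo = mid + 1;
            }
        }
        if (candidate < 0) {
            return -1; // nothing left to scan
        }
        return Math.max(row, merged.get(candidate).start);
    }
}
```

With the ranges from the question expressed as `[1, 7)` and `[100, 151)`, a reader positioned at row 7 is told to jump directly to row 100 instead of filtering rows 7 through 99 one by one. In HBase itself, the corresponding step is to build a list of `MultiRowRangeFilter.RowRange` objects, pass it to a `MultiRowRangeFilter`, and attach that filter to a single `Scan` with `setFilter`, assuming your HBase version ships the filter.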
Note: For this kind of requirement, Solr or Elasticsearch is the best approach in my opinion. You can check my answer about Solr for a high-level architecture overview. I'm suggesting this because an HBase scan over a huge amount of data will be very slow.