Apache HBase reads slow with thousands of columns


I have an Apache HBase cluster running on AWS EMR. The database consists of a single table, with strings for the row keys and column names and integers for the values. The table is wide, with 50,000 columns and about 75,000 rows. All columns are under a single column family.

rowkey  col1 col2 col3 ... col50000
rowkey1  0    255  456
rowkey2  ..   ...
rowkey3

The only operation I want to perform is selecting subsets of this matrix: pick certain rows and columns and return them. However, even selecting a single row is incredibly slow; it takes around 10 seconds to return. The documentation and case studies promise millisecond latency. What am I doing wrong?
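
For a sense of scale, here is a rough back-of-the-envelope sketch of how much data a single full-row read covers in this layout. The row and column counts are the ones given above. The byte lengths for the row key, family name, qualifier and value are assumptions picked for illustration, not measurements. The fixed per-cell fields follow HBase's KeyValue layout as I understand it, and the estimate ignores compression, block encoding and tags.

```python
# Rough size estimate for the wide table described above.
# ROWS and COLUMNS come from the question; the byte lengths are assumptions.
ROWS = 75_000
COLUMNS = 50_000

# Assumed (hypothetical) average lengths in bytes.
ROW_KEY_LEN = 10
FAMILY_LEN = 1
QUALIFIER_LEN = 8
VALUE_LEN = 4  # one 32-bit integer per cell

# Fixed per-cell fields in the KeyValue layout: key length (4),
# value length (4), row length (2), family length (1),
# timestamp (8), key type (1).
FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def cell_bytes():
    """Uncompressed size of one cell under the assumptions above."""
    return FIXED_OVERHEAD + ROW_KEY_LEN + FAMILY_LEN + QUALIFIER_LEN + VALUE_LEN


def row_bytes(columns=COLUMNS):
    """Uncompressed size of one row holding `columns` cells."""
    return columns * cell_bytes()


total_cells = ROWS * COLUMNS

if __name__ == "__main__":
    print("bytes per cell:", cell_bytes())
    print("bytes per full row:", row_bytes())
    print("bytes for a 100-column subset of one row:", row_bytes(100))
    print("cells in the table:", total_cells)
```

Under these assumptions every cell repeats the row key, family and qualifier, so a full row is 50,000 separate cells and roughly 2 MB uncompressed, while a 100-column subset of the same row is only a few kilobytes.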
