I have an Apache Hbase cluster running in AWS EMR. The database consists of a single table, with strings for rows and columns and integers in the values. The table is wide, with 50,000 columns and about 75,000 rows. All columns are under a single column family.
rowkey col1 col2 col3 ... col50000
rowkey1 0 255 456
rowkey2 .. ...
rowkey3
The only operations I want to perform is to select subsets of this matrix - select certain rows and columns and return them. However, even selecting a single row is incredibly slow - it takes around 10 seconds to return. The documentation and case studies promise milisecond latency - what am I doing wrong?