All of the keys in rocksDB's level 1 are already sorted. Therefore we can get key quickly in this level. Why does rocksDB still need to compact the files in level 1 to level 2?
I find an explanation on LevelDB's doc: Open file in one directory is slow if too many files are in the same directory. However, as mentioned in the doc, we can use sharding to solve this problem. I think sharding is much eaiser than compaction. Am I right?
Thanks in advance!
Multiple levels could reduce write amplification.
Because the data on
level 0are not sorted, so it's likely overlap with majority of data onlevel 1. Which means compaction fromlevel0 -> level1needs to compact majority of the data, which happens pretty often. If you have 1 more level:level2, then >90% data is on level 2, solevel0->level1compaction is going to much smaller, which reduces write amplification from compaction. The more level you have, in generally, the smaller the write amplification is, but higher the read amplification is, as it's need to read more levels.It's not just RocksDB, but for all LSM-tree. And in RocksDB you can actually configure the number of level to be 2, by setting
num_levels, which we don't recommend for normal usecase: https://github.com/facebook/rocksdb/blob/0ed8cb666de61d2e18bbb1fc90e05b734cee02de/include/rocksdb/advanced_options.h#L433-L434