Understanding KeyValue embedded datastore vs FileSystem


I have a basic question regarding filesystem usage. I want to use an embedded, persistent key-value store that is very write-oriented. Say my value size is (a) 10 KB or (b) 1 MB, and reads and updates are equal in number.

Can't I simply create files containing the values, with their names acting as keys?

Won't that be as fast as using a key-value store such as LevelDB or RocksDB?

Can anybody please help me understand?


There are 2 best solutions below

4
On BEST ANSWER

In principle, yes, a filesystem can be used as a key-value store. The differences only come in when you look at individual use cases and limitations in the implementations.
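To make the idea concrete, here is a minimal sketch of a file-per-key store in Python. The class name `FileKV` and its methods are hypothetical, chosen only for illustration; it assumes a POSIX-like filesystem where renaming a file over another within one directory is atomic, and it is not tuned for performance.

```python
import contextlib
import os
import tempfile
from pathlib import Path


class FileKV:
    """Toy key-value store: one file per key, the file name is the key."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        # Reject keys that could escape the root directory or collide with
        # the hidden temp files created by put().
        if not key or key.startswith(".") or "/" in key or os.sep in key or "\0" in key:
            raise ValueError(f"invalid key: {key!r}")
        return self.root / key

    def put(self, key: str, value: bytes) -> None:
        # Write to a temp file in the same directory, then rename it over the
        # target, so readers never see a half-written value.
        path = self._path(key)
        fd, tmp = tempfile.mkstemp(dir=self.root, prefix=".tmp-")
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(value)
                f.flush()
                os.fsync(f.fileno())  # push the data to stable storage
            os.replace(tmp, path)
            # Full durability would also fsync the directory; omitted here.
        except BaseException:
            with contextlib.suppress(FileNotFoundError):
                os.unlink(tmp)
            raise

    def get(self, key: str):
        # Returns the stored bytes, or None if the key does not exist.
        try:
            return self._path(key).read_bytes()
        except FileNotFoundError:
            return None

    def delete(self, key: str) -> None:
        self._path(key).unlink(missing_ok=True)

    def keys(self):
        # Sorted listing requires reading every directory entry first.
        return sorted(n for n in os.listdir(self.root) if not n.startswith("."))
```

Every `put` here costs a file creation, an fsync, and a rename, which is where a write-heavy workload would likely feel the difference compared with a store that appends to a log.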

Without going into too much detail here, there are some things likely to be very different:

  • A filesystem splits data into fixed-size blocks. Two files typically can't occupy parts of the same block. Common block sizes are 4-16 KiB; you can calculate how much overhead your 10 KiB example would cause. Key/value stores tend to be designed for smaller pieces of data.
  • Directory indexes in filesystems often can't efficiently iterate over the filenames/keys in sort order. You can efficiently look up a specific key, but you can't retrieve ranges without reading pretty much all of the directory entries. Some key/value stores, including LevelDB, support efficient ordered iteration.
  • Some key/value stores, including LevelDB, support atomic batched updates (WriteBatch in LevelDB). This means you can bundle several updates together, and LevelDB will make sure that either all of these updates make it through, or none of them do. This is very important to keep your data from becoming inconsistent. Filesystems make this much harder to implement, especially when multiple files are involved.
  • Key/value stores usually try to keep data contiguous on disk (so data can be retrieved with less seeking), whereas modern filesystems deliberately do not do this across files. This can impact performance rather severely when reading many records. It's much less of an issue on solid-state disks, though.
  • While some filesystems do offer compression features, they are usually either per-file or per-block. As far as I can see, LevelDB compresses entire chunks of records, potentially yielding better compression (though they biased their compression strategy towards performance over compression efficiency).
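The block overhead mentioned in the first point can be worked out with a few lines of Python. This is a rough sketch: the helper name `allocated_bytes` is made up for illustration, and it ignores inode and directory metadata as well as filesystem features such as tail packing or inline data that can reduce the waste.

```python
KIB = 1024


def allocated_bytes(value_size: int, block_size: int) -> int:
    """Bytes occupied on disk if a file is rounded up to whole blocks."""
    blocks = -(-value_size // block_size)  # ceiling division
    return blocks * block_size


for value_size in (10 * KIB, 1024 * KIB):
    for block_size in (4 * KIB, 8 * KIB, 16 * KIB):
        used = allocated_bytes(value_size, block_size)
        waste = used - value_size
        print(
            f"value {value_size // KIB:>5} KiB, block {block_size // KIB:>2} KiB: "
            f"allocated {used // KIB:>5} KiB, overhead {waste // KIB} KiB "
            f"({100 * waste / value_size:.0f}%)"
        )
```

Under these assumptions a 10 KiB value occupies 12 KiB with 4 KiB blocks (20% overhead) and 16 KiB with 8 or 16 KiB blocks (60% overhead), while a 1 MiB value is an exact multiple of all three block sizes and wastes nothing.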
0
On

Let's try to build a minimal NoSQL DB server using Linux and a modern filesystem in 2022, just for fun, not for a serious environment.

DO NOT TRY THIS IN PRODUCTION

POSIX file API for reads and writes.

POSIX ACL for native user accounts and group permission management.

POSIX filenames as keys: (root db folder)/(tablename folder)/(partition folder)/(64bitkey). Read/write permissions can be defined per DB and per table using POSIX ACLs. The (64bitkey) is generated in the compute function.
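As a rough sketch of that path layout, the helper below maps a table, partition, and 64-bit key to a file path. The function name `key_path`, the zero-padded hex encoding of the key, and the example folder names are all assumptions made for illustration; the answer does not specify how the key is encoded or how partitions are chosen.

```python
from pathlib import PurePosixPath


def key_path(root: str, table: str, partition: str, key: int) -> PurePosixPath:
    """Build (root db folder)/(tablename folder)/(partition folder)/(64bitkey)."""
    if not 0 <= key < 2**64:
        raise ValueError("key must fit in an unsigned 64-bit integer")
    for part in (table, partition):
        # Folder names must be single, non-special path components.
        if not part or "/" in part or part in (".", ".."):
            raise ValueError(f"invalid path component: {part!r}")
    # Zero-padded hex keeps names a fixed 16 characters, so lexicographic
    # order of the file names matches the numeric order of the keys.
    return PurePosixPath(root) / table / partition / f"{key:016x}"


print(key_path("/srv/db", "users", "p0", 255))
# /srv/db/users/p0/00000000000000ff
```

Fixed-width names are a design choice rather than a requirement: they make a sorted directory listing usable as a crude ordered scan.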

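Mount Btrfs, OpenZFS, or F2FS as the filesystem to get native compression (LZ4/zstd, depending on the filesystem) and, where supported, native encryption (fscrypt on F2FS, built-in encryption on OpenZFS). F2FS is arguably the most suitable, as it is a log-structured filesystem, a design related to (though not the same as) the LSM trees that many NoSQL databases use in their low-level architecture.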

Metadata is handled by the filesystem, so there is no need to implement it.

Use Linux and/or filesystem settings to tune the page, file, or disk block cache to the read/write patterns of the business logic written in the compute function or DB procedure.

Use RAID and sshfs for remote replication to create master/slave high availability and/or backups.

The compute function or DB procedure holding the write logic could be a Node.js file, a Go binary, or anything else, paired with a standard HTTP/TCP/WebSocket server module that reads and writes contents to the DB.
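As a toy illustration of such a server (in Python rather than Node.js or Go, and in keeping with the warning above, not for production), the sketch below serves one directory over HTTP using only the standard library. The URL scheme `/<16 hex digits>`, the `serve` and `make_handler` names, and the default port are assumptions made for the example; there is no authentication, partitioning, or ACL handling.

```python
import os
import re
import tempfile
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path

# Hypothetical URL scheme: the path is a 64-bit key as 16 hex digits.
KEY_RE = re.compile(r"^/([0-9a-f]{16})$")


def make_handler(root: Path):
    class Handler(BaseHTTPRequestHandler):
        def _key_file(self):
            m = KEY_RE.match(self.path)
            return root / m.group(1) if m else None

        def do_GET(self):
            path = self._key_file()
            if path is None or not path.is_file():
                self.send_error(404)
                return
            body = path.read_bytes()
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def do_PUT(self):
            path = self._key_file()
            if path is None:
                self.send_error(400)
                return
            length = int(self.headers.get("Content-Length", 0))
            data = self.rfile.read(length)
            # Write to a temp file, then rename, so a GET never sees a
            # partially written value.
            fd, tmp = tempfile.mkstemp(dir=root)
            with os.fdopen(fd, "wb") as f:
                f.write(data)
            os.replace(tmp, path)
            self.send_response(204)
            self.end_headers()

        def log_message(self, *args):
            pass  # keep the example quiet

    return Handler


def serve(root, host="127.0.0.1", port=8080):
    """Create (but do not start) a server storing one file per key in root."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    return ThreadingHTTPServer((host, port), make_handler(root))


if __name__ == "__main__":
    serve("./db").serve_forever()
```

With the server running, a value could be stored with something like `curl -X PUT --data-binary @value.bin http://127.0.0.1:8080/00000000000000ff` and fetched back with a plain GET on the same URL.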