Key value oriented database vs document oriented database

212 Views Asked by At

I have recently started learning NO SQL databases and I came across Key-Value oriented databases and Document oriented databases. Since they have a similar structure, aren't they saved and retrieved the exact same way? And if that is the case then why do we define them as separate types? Otherwise, how they are saved in the file system?

1

There are 1 best solutions below

0
amirouche On

To get started it is better to pin point the least wrong vocabulary. What used to be called nosql is too broad in scope, and often there is no intersection feature-wise between two database that are dubbed nosql except for the fact that they somehow deal with "data". What program does not deal with data?! In the same spirit, I avoid the term Relational Database Management System (RDBMS). It is clear to most speakers and listeners that RDBMS is something among SQL Server, some kind of Oracle database, MySQL, PostgreSQL. It is fuzzy whether that includes SQLite, that is already an indicator, that "relational database" ain't the perfect word to describe the concept behind it. Even more so, what people usually call nosql never forbid relations. Even on top of "key-value" stores, one can build relations. In a Resource Description Framework database, the equivalent of SQL rows are called tuple, triple, quads and more generally and more simply: relations. Another example of relational database are database powered by datalog. So RDBMS and relational database is not a good word to describe the intended concepts, and when used by someone, only speak about the narrow view they have about the various paradigms that exists in the data(base) world.

In my opinion, it is better to speak of "SQL databases" that describe the databases that support a subset or superset of SQL programming language as defined by the ISO standard.

Then, the NoSQL wording makes sense: database that do not provide support for SQL programming language. In particular, that exclude Cassandra and Neo4J, that can be programmed with a language (respectivly CQL and Cypher / GQL) which surface syntax looks like SQL, but does not have the semantic of SQL (neither a superset, nor a subset of SQL). Remains Google BigQuery, which feels a lot like SQL, but I am not familiar enough with it to be able to draw a line.

Key-value store is also fuzzy. memcached, REDIS, foundationdb, wiredtiger, dbm, tokyo cabinet et. al are very different from each other and are used in verrrrrrrrrrry different use-cases.

Sorry, document-oriented database is not precise enough. Historically, they were two main databases, so called document database: ElasticSearch and MongoDB. And those yet-another-time, are very different software, and when used properly, do not solve the same problems.

You might have guessed it already, your question shows a lack of work, and as phrased, and even if I did not want to shave a yak regarding vocabulary related to databases, is too broad.

Since they have a similar structure,

No.

aren't they saved and retrieved the exact same way?

No.

And if that is the case then why do we define them as separate types?

Their programming interface, their deployment strategy and their internal structure, and intended use-cases are much different.

Otherwise, how they are saved in the file system?

That question alone is too broad, you need to ask a specific question at least explain your understanding of how one or more database work, and ask a question about where you want to go / what you want to understand. "How to go from point A-understanding (given), to point B-understanding (question)". In your question point A is absent, and point B is fuzzy or too-broad.

Moar:

  • First, make sure you have solid understanding of an SQL database, at the very least the SQL language (then dive into indices and at last fine-tuning). Without SQL knowledge, your are worthless on the job market. If you already have a good grasp of SQL, my recommendation is to forgo everything else but FoundationDB.

  • If you still want "benchmark" databases, first set a situation (real or imaginary) ie. a project that you know well, that requires a database. Try to fit several databases to solve the problems of that project.

Lastly, if you have a precise project in mind, try to answer the following questions, prior to asking another question on database-design:

  • What guarantees do you need. Question all the properties of ACID: Atomic, Consistent, Isolation, Durability. Look into BASE. You do not necessarily need ACID or BASE, but it is a good basis that is well documented to know where you want / need to go.

  • What is size of the data?

  • What is the shape of the data? Are they well defined types? Are they polymorphic types (heterogeneous shapes)?

  • Workload: Write-once then Read-only, mostly reads, mostly writes, a mix of both. Answer also the question how fast or slow can be writes or reads.

  • Querying: How queries look like: recursive / deep, columns or rows, or neighboor hood queries (like graphql and SQL without recursive queries do). Again what is the expected time to response.

Do not forgo to at least the review deployement and scaling strategies prior to commit to a particular solution.

On my side, I picked up foundationdb because it is the most versatile in those regards, even if at the moment it requires some code to be a drop-in replacement for all postgresql features.