How to choose a NoSQL database engine?


We have a database with the following parameters:

  • 30k records, 7 MB in size
  • 20 inserts/second
  • 1000 updates/second
  • 1000 range selects/second, by secondary index, approx 10 rows each
  • needs at least one secondary index
  • needs some mechanism to expire keys if they are not updated for 75 secs (can be done via programmatic garbage collector but will require additional 'last_update' index and will add some load)
  • consistency is not required
  • durability is not required
  • db should be stored in memory
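
To make the expiry requirement concrete, here is a minimal sketch of the programmatic garbage collector mentioned in the list, written in plain Python rather than against any particular database. The class name `ExpiringStore`, its methods, and the injectable `clock` parameter are all hypothetical names chosen for illustration. It assumes keys are mutually comparable (for example, all strings), because they are used as tie-breakers in the heap.

```python
import heapq
import time


class ExpiringStore:
    """In-memory key/value store that drops keys not updated within `ttl` seconds."""

    def __init__(self, ttl=75.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.records = {}    # key -> (value, last_update)
        self.by_update = []  # min-heap of (last_update, key): the 'last_update' index

    def put(self, key, value):
        """Insert or update a key and refresh its last_update timestamp."""
        now = self.clock()
        self.records[key] = (value, now)
        # Older heap entries for this key become stale; collect() skips them.
        heapq.heappush(self.by_update, (now, key))

    def get(self, key):
        entry = self.records.get(key)
        return entry[0] if entry else None

    def collect(self):
        """Remove every key whose last update is at least `ttl` seconds old."""
        cutoff = self.clock() - self.ttl
        removed = []
        while self.by_update and self.by_update[0][0] <= cutoff:
            stamp, key = heapq.heappop(self.by_update)
            entry = self.records.get(key)
            # Delete only if this heap entry reflects the key's latest update.
            if entry is not None and entry[1] == stamp:
                del self.records[key]
                removed.append(key)
        return removed
```

Every update pushes one extra heap entry, which is the additional load referred to above. If `collect()` runs regularly, the heap holds roughly 75 seconds' worth of updates at most.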

For now we use Redis, but it does not have secondary indexes, and its KEYS index:foo:* command is too slow. Membase also does not have secondary indexes (as far as I know). MongoDB and the MySQL MEMORY engine have table-level locks. Which engine will fit our use case?

There are 5 best solutions below

1
On

I think Redis does indeed support secondary indexes, but they are not implicit: you have to maintain them explicitly. A sorted set can be used for a range scan, but it is less convenient than what you could have in MongoDB or a relational database.
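
As a rough illustration of what "maintaining the index explicitly" means, here is a pure-Python stand-in that does not talk to a Redis server. The class `IndexedStore` and its methods are hypothetical names. The sorted list plays the role of a sorted set, and the comments name the Redis commands (ZADD, ZREM, ZRANGEBYSCORE) that would do the equivalent work. It assumes the indexed field is numeric, as sorted-set scores are.

```python
import bisect


class IndexedStore:
    """Primary key/value map plus one explicitly maintained secondary index."""

    def __init__(self, field):
        self.field = field  # name of the record field to index
        self.records = {}   # primary key -> record dict (one hash per key in Redis)
        self.index = []     # sorted list of (score, primary key) (a sorted set in Redis)

    def put(self, key, record):
        """Insert or update a record, keeping the secondary index in sync."""
        old = self.records.get(key)
        if old is not None:
            # Drop the stale index entry first (ZREM in Redis).
            pos = bisect.bisect_left(self.index, (old[self.field], key))
            del self.index[pos]
        self.records[key] = record
        # Add the new index entry in sorted position (ZADD in Redis).
        bisect.insort(self.index, (record[self.field], key))

    def range_query(self, low, high):
        """Return records whose indexed field lies in [low, high] (ZRANGEBYSCORE)."""
        # (low,) sorts before every (low, key) tuple, so this finds the first match.
        start = bisect.bisect_left(self.index, (low,))
        result = []
        for i in range(start, len(self.index)):
            score, key = self.index[i]
            if score > high:
                break
            result.append(self.records[key])
        return result
```

The inconvenience is visible in `put`: the application, not the server, has to remove the old index entry and add the new one on every update.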

If you are looking for a Redis-like server with better support for range scans, I would suggest having a look at the discontinued AlchemyDB project, or at the commercial Aerospike 3 product (which includes some of AlchemyDB's features).

0
On

I presume that if you can achieve the performance requirements you have specified, the fact that a DB is or is not in memory would be irrelevant.

Your performance goals are well within the capabilities of a single, non-replicated and non-sharded MongoDB instance. Mongo uses memory-mapped files, so all the data will be in memory, but the DB will flush to disk continuously. By default Mongo uses "unsafe" mode, which eliminates much of the burden of disk I/O. It's worth considering for your use case, as opposed to trying to do in application code what a DB should do for you.

Adding replicas (Mongo's term for clustering) and/or sharding will give you a chance to easily increase performance should you need to. Multiple indexes (including composite indexes), flexible querying, bulk inserts and atomic updates are nice features that help with performance and can take the burden off your application code.

0
On

There are four types of NoSQL databases. You could check which one meets your requirements.

1. Document storage database. The atomic unit of the document storage database is a document. Each document is a JSON object. Different documents can have different schemas and contain different fields. The document storage database allows certain fields in the document to be indexed, enabling faster queries based on these fields.

2. Column storage database. The atomic unit of the column storage database is a column in the table, which means that the data is stored by column. This makes column-based queries very efficient, and because the values in each column have almost the same structure, the data compresses better.

3. Key-value storage database. It is fast, first because lookups go through unique keys, and second because most key-value storage databases keep data in memory (RAM), which can be accessed quickly.

4. Graph storage database. The graph storage database contains nodes that represent entities and edges that represent relationships between entities.

0
On

I was thinking he meant 7 MB per record. One other option is PlayOrm on top of Cassandra, using its Scalable-SQL. More machines can result in better performance, as the disks would be working in parallel on the range scans and such.

0
On

Try using Tarantool (http://tarantool.org/). It has secondary indexes and is completely in-memory. It also uses the fast asynchronous IProto protocol.

It has proven stable and incredibly fast.