My exposure to NoSQL or NewSQL/NeoSQL database servers is extremely limited, only theoretical. I've worked with traditional RDBMSs (like MySQL, Postgres) and directory-server (OpenLDAP), with and without replication.
My application stack is based on JBoss, and I've been tasked with setting up a minimal demo (with our application) that can demonstrate durability and high-availability of data, in VoltDB. Performance testing, is not an objective at all.
Have been going thru the VoltDB Planning Guide, but I am confused between the "+1" or "x2" in terms of number of servers (or VoltDB instances) required. Especially given these 2 statements:-
The easiest way to size hardware for a K-Safe cluster is to size the initial instance of the database, based on projected throughput and capacity, then multiply the number of servers by the number of replicas you desire (that is, the K-Safety value plus one).
Rule of Thumb
When using K-Safety, configure the number of cluster nodes as a whole multiple of the number of copies of the database (that is, K+1)
Questions:
- Now, let's say that I need 1 server given capacity/throughput requirements. So, to be able to have durability and high-availability, do I need: 2, 3 or 4 servers ?
- OTOH, using just 1 server, what all key features of VoltDB would I have to forgo ?
- Is there any relationship (or conflict) between VoltDB's full disk-persistence and snapshots ? Say, the availability of disk-persistence removes the need for snapshots ?
If you use 2 servers, you can keep a synchronous replica of data to protect from data loss, much like a RAID1 hard drive. Your data is double-safe, but there is a catch with availability. With only two servers, it's impossible to differentiate a network split from a failed node. In some cases, VoltDB will shut down a live node when another fails to ensure there will be no split brain. With 3 nodes, this won't be an issue and the cluster will remain available after any single node failure (with k=1 or k=2).
With just 1 server, all you lose is the multiple copies of data on multiple servers and the high-availability features that allow VoltDB to continue running after a node failure. You still have all of the other VoltDB features, including full disk persistence.