I'm trying to setup CrateDB in Google Cloud for analytics through Metabase, availability is not important, the data can be reloaded, just query speed. Largest table is 50 million rows 40 columns. All tables are denormalized.
What is more beneficial for CrateDB query performance, number of nodes, CPU quantity or RAM amount?
- 6 nodes x 1 VCPU 3.75GB RAM
- 3 nodes x 2 VCPU 7.5GB RAM
- 3 nodes x 1 VCPU 15GB RAM
- 3 nodes x 4 VCPU 4GB RAM
- 1 node x 6 VCPU 22.5GB RAM
Is it better to try adding as much CPU as possible, as much RAM as possible or a balance of both?
it depends on your use case, but usually you go for a mixture. but what you described, i'd go with: 3 nodes x 4 VCPU 4GB RAM
cratedb is distributed by nature so you need to run it in a cluster to use its benefits.
if you have the possibility, use ssds. spinning discs slows cratedb down a lot.