I want to compare write-intensive performance between the InnoDB and MyRocks engines of MySQL. For this purpose, I use sysbench as the benchmark. My requirements are:
- multiple threads writing concurrently to the same table;
- batch inserts (each insert transaction inserts a bulk of records).
I checked all the pre-made sysbench tests and don't see any that satisfy both requirements:
- oltp_write_only: supports multiple threads writing to the same table, but it has no bulk-insert option.
- bulk_insert: supports bulk inserts from multiple threads, but each thread writes to a different table.

Are there any pre-made sysbench tests that satisfy my requirements? If not, is there an existing custom Lua script somewhere that already does this?
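For reference, a custom script along these lines might look like the sketch below. It assumes the sysbench 1.0 Lua API; the script name, the `batch_size` and `engine` options, and the generated data are all illustrative, and the table follows the schema quoted in the comment further down.

```lua
#!/usr/bin/env sysbench
-- bulk_write_same_table.lua -- a sketch, not a stock sysbench test:
-- every worker thread issues multi-row INSERTs into the *same* table.

sysbench.cmdline.options = {
   batch_size = {"Rows per multi-row INSERT (one transaction each)", 100},
   engine     = {"Storage engine to test (e.g. innodb or rocksdb)", "innodb"}
}

-- Simple script-defined commands: `sysbench bulk_write_same_table.lua ... prepare|cleanup`
function prepare()
   local drv = sysbench.sql.driver()
   local con = drv:connect()
   -- DEFAULT for `data` omitted: every INSERT supplies it, and literal defaults
   -- on JSON columns are not portable across MySQL versions.
   con:query(string.format([[
      CREATE TABLE IF NOT EXISTS tableA (
        id      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
        user_id VARCHAR(63) NOT NULL DEFAULT '',
        data    JSON NOT NULL,
        PRIMARY KEY (id),
        UNIQUE INDEX user_id_UNIQUE (user_id)
      ) ENGINE = %s]], sysbench.opt.engine))
end

function cleanup()
   local drv = sysbench.sql.driver()
   local con = drv:connect()
   con:query("DROP TABLE IF EXISTS tableA")
end

function thread_init()
   drv = sysbench.sql.driver()
   con = drv:connect()
   counter = 0   -- per-thread row counter, keeps generated user_id values unique
end

function thread_done()
   con:disconnect()
end

function event()
   local values = {}
   for i = 1, sysbench.opt.batch_size do
      counter = counter + 1
      -- embed the thread id so user_id never collides across concurrent threads
      local user_id = string.format("u%d-%d", sysbench.tid, counter)
      values[i] = string.format("('%s', '{\"b\": %d}')", user_id, counter % 32768)
   end
   -- one event = one multi-row INSERT (a single autocommit transaction),
   -- and all threads write to the same table
   con:query("INSERT INTO tableA (user_id, data) VALUES " .. table.concat(values, ", "))
end
```

Invoke it with the usual `--mysql-*` connection options plus `--threads`, `--time`, `--batch_size`, and `--engine`, once with `prepare` and then with `run` for each engine being compared.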
(From a comment:)
CREATE TABLE IF NOT EXISTS `tableA` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`user_id` VARCHAR(63) NOT NULL DEFAULT '',
`data` JSON NOT NULL DEFAULT ('{}'),
PRIMARY KEY (`id`),
UNIQUE INDEX `user_id_UNIQUE` (`user_id` ASC)
) ENGINE = InnoDB;
(From a MySQL point of view...)
- Drop `id` and its PRIMARY KEY -- saves 8 bytes per row.
- Promote `UNIQUE(user_id)` to `PRIMARY KEY(user_id)` -- might save 40 bytes per row (depends on `LENGTH(user_id)`).

Doing those will shrink each row, hence the disk footprint and the I/O.
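If you try that, the `prepare` step of the script sketched above would create the slimmed-down table instead; again only a sketch, reusing the hypothetical `engine` option from that script:

```lua
-- Variant of prepare(): no surrogate id; user_id promoted to PRIMARY KEY.
con:query(string.format([[
   CREATE TABLE IF NOT EXISTS tableA (
     user_id VARCHAR(63) NOT NULL,
     data    JSON NOT NULL,
     PRIMARY KEY (user_id)
   ) ENGINE = %s]], sysbench.opt.engine))
```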
Run OS monitoring tools to see what percentage of the I/O is being consumed. That is likely to be the limiting factor.
Benchmarking products are handy for limited situations. For your situation (and many others), it is best to build your product and time it.
Another thought...
What does the JSON look like? If the JSON has a simple structure (a consistent set of key:value pairs), then the disk footprint might be half as much (hence speed doubling) if you made individual columns. The processing to change from JSON to individual columns would be done in the client, which may (or may not) cancel out the savings I predict.
If the JSON is more complex, there still might be savings by pulling out "columns" that are always present.
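If the structure really is that simple, the `event()` function in the script sketched above could build rows from individual columns rather than from a JSON string. Here is a variant, with purely illustrative column names (`b` as a 2-byte `SMALLINT`, `note` as a `VARCHAR`) that would also need to be added to the table in `prepare()`:

```lua
-- Illustrative variant of event(): scalar columns instead of a JSON document.
function event()
   local values = {}
   for i = 1, sysbench.opt.batch_size do
      counter = counter + 1
      local user_id = string.format("u%d-%d", sysbench.tid, counter)
      values[i] = string.format("('%s', %d, 'note-%d')", user_id, counter % 32768, i)
   end
   con:query("INSERT INTO tableA (user_id, b, note) VALUES " .. table.concat(values, ", "))
end
```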
If the JSON is "big", then compress it in the client, then write it to a `BLOB`. This may shrink the disk footprint and network bandwidth by a factor of 3.

You mentioned 250GB for 250M rows? That's 1000 bytes/row, which suggests the JSON averages about 700 bytes (note: there is row overhead). Compressing the JSON column into a `BLOB` would shrink it to maybe 400 bytes/row total, hence only 100GB for 250M rows.

`{"b": 100}` takes about 10 bytes. If `b` could be stored in a 2-byte `SMALLINT` column, that would shrink the record considerably.

Another thing: if you promote `user_id` to the PK, then this is worth considering: use a file sort to order the data by `user_id` before loading it. This is probably faster than `INSERT`ing the rows 'randomly'. (If the data is already sorted, then this extra sort would be wasted.)
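For completeness, the client-side compression idea as yet another `event()` variant. This is only a sketch: it assumes the `data` column has been changed to a `BLOB`, and that a zlib binding such as lua-zlib is installed and reachable by sysbench's LuaJIT (sysbench does not bundle one); the tiny document used here would not actually compress well.

```lua
-- Sketch: compress the JSON in the client, store the result in a BLOB column.
-- Requires an external zlib binding (e.g. lua-zlib). The compressed bytes are
-- hex-encoded so they can be inlined into the SQL text as X'...'.
local zlib = require("zlib")

local function to_hex(s)
   local t = {}
   for i = 1, #s do t[i] = string.format("%02X", s:byte(i)) end
   return table.concat(t)
end

function event()
   local values = {}
   for i = 1, sysbench.opt.batch_size do
      counter = counter + 1
      local user_id = string.format("u%d-%d", sysbench.tid, counter)
      local doc = string.format('{"b": %d, "note": "a larger, repetitive payload would go here"}',
                                counter % 32768)
      local compressed = zlib.deflate()(doc, "finish")   -- lua-zlib one-shot deflate
      values[i] = string.format("('%s', X'%s')", user_id, to_hex(compressed))
   end
   con:query("INSERT INTO tableA (user_id, data) VALUES " .. table.concat(values, ", "))
end
```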