Assume I have a collection of .json files in a folder, each of which represents exactly one book, i.e.
- Book 1 in folder/1.json
- Book 2 in folder/2.json
- …
The files are only between 6 KB and 3 MB in size each, but there are millions of them!
Now I do the following with MariaDB:
CREATE OR REPLACE TABLE books (
isbn CHAR(13) NOT NULL,
title VARCHAR(256) NULL
) ENGINE=CONNECT TABLE_TYPE=JSON OPTION_LIST="PRETTY=2,JMODE=1,MAPPING=1" MULTIPLE=1 FILE_NAME="folder/*.json" READONLY=1;
and then a simple query such as
SELECT COUNT(*) FROM books;
This works fine if there are only a few files in the folder. However, as soon as there are even just a few thousand of them, it no longer works: the client fails with "Lost connection to MySQL server during query", and the server logs are not really helpful (to me).
I guess this is a memory problem, but I do not understand why it happens: it does not seem necessary to load each file/book object into memory at once; the server could just go through the files one by one, right?
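For what it's worth, a single-file version of the table works fine, which is the kind of setup I mean by "only a few files". A minimal sketch of that sanity check (assuming folder/1.json is one of the existing files; it mirrors the definition above, just without the wildcard and MULTIPLE):

CREATE OR REPLACE TABLE book_one (
  isbn CHAR(13) NOT NULL,
  title VARCHAR(256) NULL
) ENGINE=CONNECT TABLE_TYPE=JSON OPTION_LIST="PRETTY=2,JMODE=1,MAPPING=1" FILE_NAME="folder/1.json" READONLY=1;

-- returns the single book from folder/1.json without problems
SELECT isbn, title FROM book_one;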
Things I tried without success (applied roughly as sketched after this list):
- PRETTY=0 instead of PRETTY=2
- HUGE=1
- set global connect_work_size=999999999 (or so)
- MAPPING=0 instead of MAPPING=1
- max_allowed_packet=1024M in my.ini
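For reference, this is roughly how those settings were applied (the exact values are from memory, so treat them as approximate):

-- table options, tried in various combinations in the CREATE statement above:
--   OPTION_LIST="PRETTY=0,JMODE=1,MAPPING=0"  and  HUGE=1
-- server-side work area, set at runtime:
SET GLOBAL connect_work_size = 999999999;  -- or some similarly large value
-- and in my.ini:
--   max_allowed_packet = 1024M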
Any help is much appreciated.
Edit:
The files are just ordinary JSON object files, such as
{"timestamp":1564002510168,"isbn":"9781081958077","author":"Mark Twain", "title": "The Adventures of Tom Sawyer: Annotated"} (and some further attributes).