MySQL: Insert data from one file into two tables, one with auto_increment id, and one join table

588 Views Asked by At

I've got data with 500M+ records, in a file with 2 fields, blob and c_id.

There are also two other files with the same data in other formats: A file with 90M unique blobs. A file with a blob, and a comma-separated list of c_ids per record.

I've got two tables:

table_a: [id, blob] # id is auto-increment
table_b: [a_id, c_id]

For each unique blob, a record in table_a must be created. For each record in the file, a record in table_b must be created with the appropriate foreign key to table_a.

The solution I use now is to generate insert statements, using last_insert_id, but it's too slow. I'd prefer to use LOAD DATA INFILE, but the auto-increment id is making stuff complicated.

e.g.

# Raw data
c_id   blob
   1   aaaa
   2   aaaa
   3   aaaa
   3   aaab
   4   aaac

Desired output:

# Table_a
id  blob
 1  aaaa
 2  aaab
 3  aaac

# Table_b
c_id a_id
   1    1
   2    1
   3    2
   3    2
   4    3
1

There are 1 best solutions below

1
On

I am not sure how you are populating the "c_id" field for table_b, but you can do something like this: load all the data in table_a first, then load table_b by executing batch queries like:

"SELECT id into outfile '/tmp/file1.csv' FIELDS TERMINATED BY ';' LINES TERMINATED BY '\n' from table_a where id > 0 limit 100000"

and using load infile on '/tmp/file1.csv'.