INSERT INTO ... SELECT takes a long time on a cluster


My MySQL cluster: Ver 5.6.30-76.3-56 for debian-linux-gnu on x86_64 (Percona XtraDB Cluster (GPL), Release rel76.3, Revision aa929cb, WSREP version 25.16, wsrep_25.16)

I have a complicated SQL query that inserts about 36k rows into a table using this syntax:

INSERT INTO `sometable` (SELECT ...);

The SELECT itself is complicated but fast (0.0023 s), yet the INSERT takes about 40-50 s. The table is not otherwise in use while I'm inserting the rows.

My questions are:

  • Can I speed it up somehow?
  • How can I avoid the locking problems that the slow insert causes on the source tables (because of the SELECT)?
  • Is this workflow good or bad practice? Is there a better one?

Thanks

UPDATE:

The table schema:

CREATE TABLE `sometable` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `user_id` int(11) unsigned DEFAULT NULL,
  `a` varchar(255) DEFAULT NULL,
  `b` smallint(6) unsigned DEFAULT NULL,
  `c` smallint(6) unsigned DEFAULT NULL,
  `d` smallint(6) unsigned DEFAULT NULL,
  `e` smallint(6) unsigned DEFAULT NULL,
  `f` varchar(255) DEFAULT '',
  `country_id` int(10) unsigned DEFAULT NULL,
  `city_id` int(10) unsigned DEFAULT NULL,
  `g` smallint(6) unsigned DEFAULT NULL,
  `h` smallint(6) unsigned DEFAULT NULL,
  `i` smallint(6) unsigned DEFAULT NULL,
  `j` smallint(6) unsigned DEFAULT NULL,
  `k` smallint(6) unsigned DEFAULT NULL,
  `l` varchar(3) DEFAULT NULL,
  `m` varchar(3) DEFAULT NULL,
  `n` text,
  `o` varchar(255) DEFAULT NULL,
  `p` varchar(32) DEFAULT NULL,
  `q` varchar(32) DEFAULT NULL,
  `r` varchar(32) DEFAULT NULL,
  `s` time DEFAULT NULL,
  `t` time DEFAULT NULL,
  `u` text,
  PRIMARY KEY (`id`),
  KEY `user_id` (`user_id`),
  KEY `country_id` (`country_id`),
  KEY `city_id` (`city_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

UPDATE2:

When I try to run the query, in some cases I get an error:

ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

MY SOLUTION:

Here is my final solution, in case anybody is interested: gist

The main problem was that while I was filling mytable, the other queries got stuck and the cluster had serious performance problems. In this solution I create a temporary table and fill it with the data in "dirty read" (READ UNCOMMITTED) mode, then copy that data into mytable in chunks. It takes a bit more time overall, but there are no performance problems and the other queries are not blocked.
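The steps above can be sketched as follows. This is only a sketch: the staging-table name and the chunk size are illustrative, the "SELECT ..." stands for the original complicated query, and the column list is abbreviated to the first few columns of the schema above.

-- 1) Read the source tables without taking shared locks ("dirty read").
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

-- 2) Stage the result set; the AUTO_INCREMENT key yields dense ids to chunk on.
CREATE TEMPORARY TABLE staging (
  chunk_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (chunk_id)
) AS SELECT ... ;

-- 3) Copy into mytable in small transactions so other queries are not blocked.
--    Repeat, advancing @from by 1000, until no more rows are inserted.
SET @from = 0;
INSERT INTO mytable (user_id, a, b, c)
  SELECT user_id, a, b, c
  FROM staging
  WHERE chunk_id > @from AND chunk_id <= @from + 1000;

DROP TEMPORARY TABLE staging;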

There are 2 answers below.

BEST ANSWER:

A SELECT operation that returns a row of the length you describe every 64 nanoseconds is very fast. That's what 36 kilorows in 2.3 milliseconds works out to. It seems likely that your SELECT query timing doesn't account for the transport of the result set to the MySQL client. At any rate, using that performance as a comparison to an INSERT operation sets your expectations unreasonably high.

You might try issuing this command before starting your operation. It allows your SELECT to proceed with less contention against your application's traffic on the SELECT's source tables. See https://dev.mysql.com/doc/refman/5.7/en/set-transaction.html

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

You might try a two-step process involving a temporary table. This has the advantage of not having to update all the indexes on some_table at the same time the SELECT operation runs. That operation will look like this.

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
CREATE TEMPORARY TABLE insert_batch AS SELECT ... ;
INSERT INTO some_table SELECT * FROM insert_batch;
DROP TEMPORARY TABLE insert_batch;

You should understand that InnoDB posts your batch of insertions to your table as a single transaction. If you can do this in a way that handles about 500 rows at a time rather than 36K, you'll have more transactions, but they will be smaller. That's generally a way to get higher throughput.
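One way to carve the temporary table into ~500-row transactions is a simple loop. Here is a sketch as a stored procedure; the procedure name and the seq key are illustrative, insert_batch is created with an explicit AUTO_INCREMENT key so its ids are dense, and the column list is abbreviated.

CREATE TEMPORARY TABLE insert_batch (
  seq INT UNSIGNED NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (seq)
) AS SELECT ... ;

DELIMITER //
CREATE PROCEDURE copy_in_chunks()
BEGIN
  DECLARE next_from INT UNSIGNED DEFAULT 0;
  DECLARE rows_copied INT DEFAULT -1;
  REPEAT
    -- Under autocommit, each INSERT here is its own small transaction.
    INSERT INTO some_table (user_id, a, b)
      SELECT user_id, a, b
      FROM insert_batch
      WHERE seq > next_from AND seq <= next_from + 500;
    SET rows_copied = ROW_COUNT();        -- rows inserted by the statement above
    SET next_from = next_from + 500;
  UNTIL rows_copied = 0 END REPEAT;
END//
DELIMITER ;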

SECOND ANSWER:

If all else fails, this may be a viable solution. First, see http://mysql.rjweb.org/doc.php/deletebig#deleting_in_chunks

  1. Load your corrections into a temp table (or non-replicated MyISAM table).
  2. Loop through the temp table (using code similar to that link). Pick 100 rows at a time.
  3. Do the INSERT ... SELECT ... of 100 rows in a separate transaction.
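One pass of steps 2-3 can be sketched like this, assuming the temp table is named corrections and has an AUTO_INCREMENT id (both names illustrative, column list abbreviated). Following the linked pattern, each pass finds the chunk's upper bound by PRIMARY KEY, then inserts exactly that id range in its own transaction.

SET @left = 0;

-- Find the id of the 100th row after @left; repeat the whole pass
-- until this SELECT finds no row.
SELECT id INTO @right FROM corrections
  WHERE id > @left ORDER BY id LIMIT 99, 1;

START TRANSACTION;
INSERT INTO some_table (user_id, a, b)
  SELECT user_id, a, b
  FROM corrections
  WHERE id > @left AND id <= @right;
COMMIT;
SET @left = @right;

When fewer than 100 rows remain, the bounding SELECT returns no row; insert the final remainder (id > @left) in one last transaction.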

This technique may (or may not) take longer than 40-50 s, but at least it is much less likely to time out or deadlock.

In general, avoid running any transaction that lasts longer than a few seconds. The link above is a somewhat generic guide to "chunking" lengthy (and repetitive) operations to avoid long transactions.