I'm trying to perform an operation on a MySQL database table using the InnoDB storage engine. This operation is an INSERT-or-UPDATE type operation where I have an incoming set of data and there may be some data already in the table which must be updated. For example, I might have this table:
test_table
+-------+--------------+------+-----+---------+----------------+
| Field | Type         | Null | Key | Default | Extra          |
+-------+--------------+------+-----+---------+----------------+
| id    | int(11)      | NO   | PRI | NULL    | auto_increment |
| value | varchar(255) | NO   |     | NULL    |                |
+-------+--------------+------+-----+---------+----------------+
... and some sample data:
+----+-------+
| id | value |
+----+-------+
|  1 | foo   |
|  2 | bar   |
|  3 | baz   |
+----+-------+
Now, I want to "merge" the following values:
2, qux
4, corge
My code ultimately ends up issuing the following queries:
BEGIN;
SELECT id, value FROM test WHERE id=2 FOR UPDATE;
UPDATE test SET id=2, value='qux' WHERE id=2;
INSERT INTO test (id, value) VALUES (4, 'corge');
COMMIT;
(I'm not precisely sure what happens with the SELECT ... FOR UPDATE and the UPDATE because I'm using MySQL's Connector/J library for Java and simply calling the updateRow method on a ResultSet. For the sake of argument, let's assume that the queries above are actually what are being issued to the server.)
Note: the above table is a trivial example to illustrate my question. The real table is more complicated and I'm not using the PK as the field to match when executing SELECT ... FOR UPDATE. So it's not obvious whether the record needs to be INSERTed or UPDATEd by just looking at the incoming data. The database MUST be consulted to determine whether to use an INSERT or an UPDATE.
The above queries work just fine most of the time. However, when there are more records to be "merged", the SELECT ... FOR UPDATE and INSERT statements can be interleaved, and I cannot predict the order in which they will be issued.
The result is that sometimes transactions deadlock because one thread has locked a part of the table for the UPDATE operation and is waiting on a table lock (for the INSERT, which requires a lock on the primary-key index), while another thread has already obtained a table lock for the primary key (presumably because it issued an INSERT query) and is now waiting for a row-lock (or, more likely, a page-level lock) which is held by the first thread.
This is the only place in the code where this table is updated and there are no explicit locks currently being obtained. The ordering of the UPDATE versus INSERT seems to be the root of the issue.
There are a few possibilities I can think of to "fix" this.
- Detect the deadlock (MySQL throws an error) and simply re-try. This is my current implementation because the problem is somewhat rare. It happens a few times per day.
- Use LOCK TABLES to obtain a table-lock before the merge process and UNLOCK TABLES afterward. This evidently won't work with MariaDB Galera -- which is likely in our future for this product.
- Change the code to always issue INSERT queries first. This would result in any table-level locks being acquired first and avoid the deadlock.
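For reference, the first option (detect the deadlock and retry) might look roughly like the sketch below. This is not the code from my application: the class name DeadlockRetry and the SqlWork interface are hypothetical names made up for illustration. It relies only on the fact that MySQL reports a deadlock as error 1213 (ER_LOCK_DEADLOCK) with SQLSTATE 40001, both of which are visible through java.sql.SQLException.

```java
import java.sql.SQLException;

/** Hypothetical helper: re-runs a unit of work when MySQL reports a deadlock. */
final class DeadlockRetry {

    /** A unit of work that may throw SQLException (illustrative type, not a JDBC API). */
    interface SqlWork<T> {
        T run() throws SQLException;
    }

    /** MySQL reports a deadlock as vendor code 1213 with SQLSTATE 40001. */
    static boolean isDeadlock(SQLException e) {
        return e.getErrorCode() == 1213 || "40001".equals(e.getSQLState());
    }

    /**
     * Runs the work, trying at most maxAttempts times.
     * Only deadlock errors are retried; any other SQLException propagates at once.
     * If every attempt deadlocks, the last deadlock exception is rethrown.
     */
    static <T> T run(int maxAttempts, SqlWork<T> work) throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.run();
            } catch (SQLException e) {
                if (!isDeadlock(e)) {
                    throw e;
                }
                last = e; // deadlock victim: loop around and try again
            }
        }
        throw last;
    }
}
```

Because InnoDB rolls back the whole transaction of the deadlock victim, the work passed to run would need to contain the entire merge transaction (from BEGIN through COMMIT), not just the statement that failed.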
The problem with #3 is that it will require more complicated code in a method that is already fairly complicated (a "merge" operation is inherently complex). That more-complicated code also means roughly double the number of queries (SELECT to determine if the row id already exists, then later, another SELECT ... FOR UPDATE/UPDATE to actually update it). This table is under a reasonable amount of contention, so I'd like to avoid issuing more queries if possible.
Is there a way to force MySQL to obtain a table-level lock without using LOCK TABLES? That is, in a way that will work if we move to Galera?
I think you may be able to do what you want by acquiring a set of row and gap locks up front: issue a single SELECT ... FOR UPDATE that covers every incoming key before doing any UPDATE or INSERT.
The SELECT query will lock the rows that already exist, and create gap locks for the rows that don't exist yet. The gap locks will prevent other transactions from creating those rows.
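As a rough illustration of that idea in JDBC, here is a minimal sketch. It makes several assumptions: the table and column names (test, id, value) are taken from the question's example queries, the class and method names (LockingMerge, lockSql, merge) are invented for this sketch, the matching column is indexed, and the connection runs at InnoDB's default REPEATABLE READ isolation level (gap locks are not taken for searches under READ COMMITTED).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: lock all incoming keys first, then UPDATE or INSERT. */
final class LockingMerge {

    /** Builds "SELECT id FROM test WHERE id IN (?, ?, ...) FOR UPDATE" for n keys. */
    static String lockSql(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("need at least one key");
        }
        return "SELECT id FROM test WHERE id IN ("
                + String.join(", ", Collections.nCopies(n, "?"))
                + ") FOR UPDATE";
    }

    /** Merges incoming (id -> value) pairs inside one transaction. */
    static void merge(Connection conn, Map<Integer, String> incoming) throws SQLException {
        if (incoming.isEmpty()) {
            return;
        }
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try {
            // Step 1: one locking read over every incoming key.
            Set<Integer> existing = new HashSet<>();
            try (PreparedStatement lock = conn.prepareStatement(lockSql(incoming.size()))) {
                int i = 1;
                for (Integer id : incoming.keySet()) {
                    lock.setInt(i++, id);
                }
                try (ResultSet rs = lock.executeQuery()) {
                    while (rs.next()) {
                        existing.add(rs.getInt(1));
                    }
                }
            }
            // Step 2: UPDATE the keys that were found, INSERT the rest.
            try (PreparedStatement upd = conn.prepareStatement(
                         "UPDATE test SET value = ? WHERE id = ?");
                 PreparedStatement ins = conn.prepareStatement(
                         "INSERT INTO test (id, value) VALUES (?, ?)")) {
                for (Map.Entry<Integer, String> e : incoming.entrySet()) {
                    if (existing.contains(e.getKey())) {
                        upd.setString(1, e.getValue());
                        upd.setInt(2, e.getKey());
                        upd.executeUpdate();
                    } else {
                        ins.setInt(1, e.getKey());
                        ins.setString(2, e.getValue());
                        ins.executeUpdate();
                    }
                }
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

For the question's example, lockSql(2) bound to ids 2 and 4 would lock the existing row 2 and the gap where row 4 would go. One caveat worth keeping in mind: gap locks held by different transactions do not block each other, so two transactions merging the same missing key could still deadlock at INSERT time, and keeping the retry-on-deadlock handling around this is probably still sensible.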