Suppose we have a big MySQL table with fewer than 10 million rows.
If I want to select all the rows, obviously a full table scan works fine:
select * from table_name;
But how can we do it in parallel? The solution I found in Sqoop is split queries:
select * from table_name where id >= 1 and id < 10000;
select * from table_name where id >= 10000 and id < 20000;
select * from table_name where id >= 20000 and id < 30000;
...
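For context, Sqoop derives ranges like these by first running a boundary query over the split column (assumed here to be `id`) and then dividing the interval evenly across its mappers. A minimal sketch of that boundary query:

-- Sqoop-style boundary query: find the split column's min and max
SELECT MIN(`id`), MAX(`id`) FROM `table_name`;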
The problem is that if the id range is big, MySQL may treat the query as a full table scan.
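One workaround that may help (my own suggestion, not something Sqoop generates) is to force the optimizer to keep using the primary key even when the range looks large:

-- hint MySQL to do a range scan over the PRIMARY key for one chunk
SELECT * FROM `table_name` FORCE INDEX (PRIMARY)
WHERE `id` >= 1 AND `id` < 10000;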
Update1: Slow query log
# Query_time: 600.632844 Lock_time: 0.000071 Rows_sent: 624 Rows_examined: 236584
SELECT `id`, ... FROM `table_name` WHERE ( `id` >= 647121 ) AND ( `id` <= 765101 );
Update2: Explain
+----+-------------+------------------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table            | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+------------------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | table_name       | range | PRIMARY       | PRIMARY | 4       | NULL |    1 | Using where |
+----+-------------+------------------+-------+---------------+---------+---------+------+------+-------------+
Update3: MySQL Version
+------------+
| version()  |
+------------+
| 5.1.46-log |
+------------+
And if we send 3 queries like that at the same time, the server's query time gets so long that we cannot bear it.
So, is it possible to run split queries in parallel? If not, why does Sqoop do it like this? Thanks.
It looks like it is not using any keys. Are you using a rather old MySQL version?
The EXPLAIN should be similar to this (illustrative output; the row estimate depends on the chunk size):
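+----+-------------+------------+-------+---------------+---------+---------+------+-------+-------------+
| id | select_type | table      | type  | possible_keys | key     | key_len | ref  | rows  | Extra       |
+----+-------------+------------+-------+---------------+---------+---------+------+-------+-------------+
|  1 | SIMPLE      | table_name | range | PRIMARY       | PRIMARY | 4       | NULL | 10000 | Using where |
+----+-------------+------------+-------+---------------+---------+---------+------+-------+-------------+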