Why PHP Mysql multi_query() is slower than loop query()

739 Views Asked by At

I'm working on HHVM with MySQL. And I'm really confused to find out that batching 2000 sql queries using multi_query() is much slower than a 2000 loop with single query() (please see the code and the result at end). By further profiling, I found the API next_result() takes most of the time (~70%).

My questions are:
(1) Why the API next_result() is so slow?
(2) If I want to do thousands of sql queries together, is there any way better than a naive loop?
Thanks!

Here is the code (php):

$conn = new mysqli("localhost", "root", "pwd", "table");

$loop = 2000;
$q = "select * from ContactInfo;";

// single query in a loop
$results = array();
$sq_start = microtime(true);
for ($i=0; $i < $loop; $i++) {
  $ret = $conn->query($q);
  $results[] = $ret;
}
for ($i=0; $i < $loop; $i++) {
  $xx = $results[$i]->fetch_all();
}
$sq_end = microtime(true);

// construct the multi-query
for ($i=0; $i < $loop; $i++) {
  $m_q .= $q; 
}

// multi-query in one round-trip
$mq_start = microtime(true);
$conn->multi_query($m_q);
do {
  $ret = $conn->store_result();
  $xx = $ret->fetch_all();
} while($conn->next_result());
$mq_end = microtime(true);

echo "Single query: " . ($sq_end - $sq_start)*1000 . " ms\n";
echo "Multi query: " . ($mq_end - $mq_start)*1000 . " ms\n";

The result is following:

Single query: 526.38602256775 ms
Multi query: 1408.7419509888 ms

Note: next_result() will consume 922ms in this case.

1

There are 1 best solutions below

2
On

The simplest answer here is overhead. multi_query() builds a data object for each result set. It then has to store that object over and over.

By contrast, you're running the same query over and over but only writing a simple data array to store the results. PHP can then free the memory of the previous result set because you're constantly overwriting the same variable (once a data object has nothing pointing to it internally it can be garbage collected).

This isn't a very good use case, though. There's no real world application here because there's no need to run the same query over and over (your DB should cache the results the first run and then store the data in memory to faster retrieval). For a single result set, the difference here is negligible.