Parallelize oracle union query questions

11.1k Views Asked by At

I have an Oralce query like this:

Sub_query1
Union
Sub_query2;

I want to parallelize the query. I searched on web and found some say UNION cannot be parallelized because the sub queries are running in serial and UNION will not run until two sub queries completed. And they are people say UNION can be parallelized. My question is:

         (1) can a UNION query be parallezied? if yes, how? if no, why?
         (2) can I just parallelize the two sub queries?

I am using Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production

Thank you!

3

There are 3 best solutions below

2
On

By doing some tests and comparing execution plans, I finally figured out a way to parallelize the union like this:

select/* +parallel (Result) */ * from
(Sub_query1
Union
Sub_query2) Result;

By doing this, the time and cpu cost almost half of the serial version. Adding parallel hints to both subqueries doesn't change time and cpu cost.

0
On

I think you are confusing running the two queries at the same time versus running the query in parallel. SQL is a descriptive language, which gets translated into code by the SQL engine/optimizer. This query plan consists of many different components, for retrieving data from a table, doing a join, doing aggregations, and so on.

Oracle produces a query plan for your union query. Each component of the query plan can make use of all available processors (assuming the right conditions are met). However, each component basically runs one at a time (to a reasonable approximation). So, the components of the query are parallelized, although the two subqueries don't run at the same time.

One piece of advice. Whenever you think about using a UNION, you should ask yourself whether or not a UNION ALL would also work. The UNION ALL is much more efficient, because it does not have to remove duplicates on the final result set.

0
On

Yes, as you've already discovered, a UNION query can run in parallel.

To fully understand what's going on here, you may want to read about parallel execution in the VLDB and Partitioning Guide.

Intra-operation parallelism can happen almost anywhere. Inter-operation parallelism only happens between producers and consumers. Which in this case means the UNION (a consumer) can execute in parallel the entire time. Each subquery (the producers) will execute in parallel, but not at the same time as each other.

You can see this happening in the example below, by looking at the active report for the query.

--Create two simple tables
create table test1(a number);
create table test2(a number);

--Populate them with 10 million rows
begin
    for i in 1 .. 100 loop
        insert into test1 select level from dual connect by level <= 100000;
        insert into test2 select level from dual connect by level <= 100000;
    end loop;
end;
/
commit;

--Gather stats
begin
    dbms_stats.gather_table_stats(user, 'TEST1');
    dbms_stats.gather_table_stats(user, 'TEST2');
end;
/

--Run a simple UNION.
select /*+ parallel */ count(*) from
(
   select a from test1 join test2 using (a) where a <= 1000
   union
   select a from test2 join test1 using (a) where a <= 1000
);

--Find the SQL_ID by looking at v$sql, then get the active report
--(which must be saved and viewed in a browser)
select dbms_sqltune.report_sql_monitor(sql_id => 'bv5c18gyykntv', type => 'active')
from dual;

Here's part of the output. It's difficult to read, but it shows how the UNION, the first 11 steps of the plan, runs the whole time. The first subquery, the next 9 lines, runs during the first half of the query. Then the second subquery, the last 9 lines, runs during the second half of the query. active report for parallel UNION query