Unable to query multiple tables via XML: Error occurred in XML processing

486 Views Asked by At

I would like to get a column_name and table_name for a list from all_tab_columns (which is not a problem till here) and also for each given column I want to go to the original table/column and see what is the top value with highest occurence.

With the query below I get the desired value for 1 example of column in 1 table:

select   col1
from    (SELECT col1, rank () over (order by count(*) desc) as rnk
         from  T1
         Group by col1
         ) 
 where   rnk = 1

now I want something like this:

select  table_name,
        column_name,
        xmlquery('/ROWSET/ROW/C/text()'
            passing xmltype(dbms_xmlgen.getxml( 'select ' || column_name  || ' from (select ' || column_name ||', rank () over (order by count(*) desc) as rnk from ' 
            || table_name || ' Group by ' || column_name || ') where rnk = 1;'))
            returning content) as C
from all_tab_columns
where owner = 'S1'
and table_name in ('T1', 'T2', 'T3', 'T4')
;

but it does not work. This is the error I get:

ORA-19202: Error occurred in XML processing
ORA-00933: SQL command not properly ended
ORA-06512: at "SYS.DBMS_XMLGEN", line 176
ORA-06512: at line 1
19202. 00000 -  "Error occurred in XML processing%s"
*Cause:    An error occurred when processing the XML function
*Action:   Check the given error message and fix the appropriate problem

I make an example. These are my two tables, for instance; T1:

col.1      col.2 col.3
----- ---------- -----
y                m1   
y             22 m2   
n             45 m2   
y             10 m5   

and T2:

col.1 col.2   col.3
----- ------- -----
    1 germany xxx  
    2 england xxx  
    3 germany uzt  
    3 germany vvx  
    8 US      XXX  

so

  • from T1/Col.1 I should get 'y'
  • from T1/col.3 I should get 'm2'
  • from T2/col.3 I should get 'xxx'

and so on.

1

There are 1 best solutions below

12
On BEST ANSWER

The important error in what has been reported to you is this one:

ORA-00933: SQL command not properly ended

Remove the semicolon from the query inside the dbms_xmlgen.getxml() call:

select  table_name,
        column_name,
        xmlquery('/ROWSET/ROW/C/text()'
            passing xmltype(dbms_xmlgen.getxml( 'select ' || column_name  || ' from (select ' || column_name ||', rank () over (order by count(*) desc) as rnk from ' 
            || table_name || ' Group by ' || column_name || ') where rnk = 1'))
                                                                     -------^ no semicolon here
            returning content) as C
from all_tab_columns
...

Your XPath seems to be wrong too though; you're looking for /ROWSET/ROW/C, but C is the column alias for the entire expression, not the column being counted. You need to alias the column name within the query, and use that in the XPath:

select  table_name,
        column_name,
        xmlquery('/ROWSET/ROW/COL/text()'
                           -- ^^^
            passing xmltype(dbms_xmlgen.getxml( 'select ' || column_name  || ' as col from (select ' || column_name ||', rank () over (order by count(*) desc) as rnk from ' 
                                                                            -- ^^^^^^
            || table_name || ' Group by ' || column_name || ') where rnk = 1'))
            returning content) as C
from all_tab_columns
... 

With your sample data that gets:

TABLE_NAME                     COLUMN_NAME                    C         
------------------------------ ------------------------------ ----------
T1                             col.1                          y         
T1                             col.2                          224510    
T1                             col.3                          m2        
T2                             col.1                          3         
T2                             col.2                          germany   
T2                             col.3                          xxx       

db<>fiddle

The XMLQuery is returning an XMLtype result, which your client is apparently showing as (XMLTYPE). You can probably change that behaviour - e.g. in Sql Developer from Tool->Preferences->Database->Advanced->DIsplay XMl Value in Grid. But you can also convert the reult to a string, using getStringVal() to return a varchar2 (or getClobVal() if you have CLOB values, which might cause you other issues):

select  table_name,
        column_name,
        xmlquery('/ROWSET/ROW/COL/text()'
            passing xmltype(dbms_xmlgen.getxml( 'select ' || column_name  || ' as col from (select ' || column_name ||', rank () over (order by count(*) desc) as rnk from ' 
            || table_name || ' Group by ' || column_name || ') where rnk = 1'))
            returning content).getStringVal() as C
                           -- ^^^^^^^^^^^^^^^
from all_tab_columns
... 

As you can see, this doesn't do quite what you might expect when there are ties due to equal counts - in your example, there are found different values for T1."col.2" (null, 10, 22, 45) which each appear once; and the XMLQuery is sticking them all together in one result. You need to decide what you want to happen in that case; if you only want to see one then you need to specify how to decide to break ties, within the analytic order by clause.

I actually want to see all results but I expected to see them in different rows

An alternative approach that allows that is to use XMLTable instead of XMLQuery:

select table_name, column_name, value
from (
  select atc.table_name, atc.column_name, x.value, x.value_count,
    rank() over (partition by atc.table_name, atc.column_name
      order by x.value_count desc) as rnk
  from all_tab_columns atc
  cross join xmltable(
    '/ROWSET/ROW'
    passing xmltype(dbms_xmlgen.getxml(
           'select "' || column_name  || '" as value, count(*) as value_count '
        || 'from ' || table_name || ' '
        || 'group by "' || column_name || '"'))
    columns value varchar2(4000) path 'VALUE',
            value_count number path 'VALUE_COUNT'
  ) x
  where atc.owner = user
  and atc.table_name in ('T1', 'T2', 'T3', 'T4')
)
where rnk = 1;

The inner query cross-joins all_tab_columns to an XMLTable which does a simpler dbms_xmlgen.get_xml() call to just get every value and its count, extracts the values and counts as relational data from the generated XML, and includes the ranking function as part of that subquery rather than within the XML generation. If you run the subquery on its own you'll see all possibel values and their counts, along with each values' ranking.

The outer query then just filters on the ranking, and shows you the relevant columns from the subquery for the first-ranked result.

db<>fiddle