I'm struggling with query performance. I'm having "flat" XML stored as XML in table. And I'm trying to extract all elements and attributes of that XML to some flat table. The issue is that I'm having ~200 columns/attributes to parse. This is how it looks like:
SELECT
x.i.value('./@Id', 'nvarchar(max)') as [Id],
x.i.value('(./Elem1/text())[1]', 'nvarchar(max)') as [Elem1],
x.i.value('(./Elem2/text())[1]', 'nvarchar(max)') as [Elem2],
x.i.value('./@Attr1', 'nvarchar(max)') as [Attr1],
x.i.value('./@Attr2', 'nvarchar(max)') as [Attr2],
x.i.value('./@Attr3', 'nvarchar(max)') as [Attr3],
x.i.value('./@Attr4', 'nvarchar(max)') as [Attr4],
x.i.value('./@Attr5', 'nvarchar(max)') as [Attr5],
x.i.value('./@Attr6', 'nvarchar(max)') as [Attr6],
x.i.value('./@Attr7', 'nvarchar(max)') as [Attr7],
x.i.value('(./Elem3/text())[1]', 'nvarchar(max)') as [Elem3],
x.i.value('(./Elem4/text())[1]', 'nvarchar(max)') as [Elem4],
x.i.value('(./Elem5/text())[1]', 'nvarchar(max)') as [Elem5],
x.i.value('(./Elem6/text())[1]', 'nvarchar(max)') as [Elem6],
...
x.i.value('(./Elem200/text())[1]', 'nvarchar(max)') as [Elem200]
FROM [XmlLoad].[DataTable] as t
CROSS APPLY t.[Xml].nodes('./Log') as x(i);
And sample content of XML column is:
<Log Id="1" Attr1="a" Attr2="a" Attr3="a" Attr4="a" Attr5="a" Attr6="a" Attr7="a" >
<Elem1>value</Elem1>
<Elem2>value</Elem2>
<Elem3>value</Elem3>
<Elem4>value</Elem4>
<Elem5>value</Elem5>
<Elem6>value</Elem6>
...
<Elem200>value</Elem200>
</Log>
What SQL server does is it create a nested loop for each element and this gives insane execution plan. It's not a problem when I'm extracting 10 elements but for 200 this is really slow.
I've tried primary and secondary XML index and performance got even worse, I've also tried changing parsing changing to:
x.i.value('(./Log/Elem1/text())[1]', 'nvarchar(max)') as [Elem1],
Without CROSS APPLY but execution plan is the same. And execution plan is... part of execution plan
With full plan looking like this :) full execution plan
Is there some way to improve that? Should I rebuild query somehow?
-EDIT - I have the same issue, and instead of opening a new question, I guess adding a easy to reproduce example may help. In the example we only have 4 fields, but when the number of fields increases, this really becomes problematic:
DECLARE @XML XML;
SELECT @XML='<?xml version="1.0"?>
<CUSTOMER_IMPORT xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<CUSTOMER>
<CUSTOMER_ID>1</CUSTOMER_ID>
<SOMETHING_A>TEST</SOMETHING_A>
<SOMETHING_B>1234567</SOMETHING_B>
<SOMETHING_C>Bla</SOMETHING_C>
</CUSTOMER>
</CUSTOMER_IMPORT>';
DECLARE @CUSTOMER TABLE(
LOOP_ID INT IDENTITY(1,1),
CUSTOMER_ID INT,
SOMETHING_A nvarchar(2000),
SOMETHING_B nvarchar(50),
SOMETHING_C nvarchar(100)
);
INSERT INTO @CUSTOMER SELECT
Customer.value('(./CUSTOMER_ID/node())[1]', 'int') AS CUSTOMER_ID,
Customer.value('(./SOMETHING_A/node())[1]', 'nvarchar(2000)') AS SOMETHING_A,
Customer.value('(./SOMETHING_B/node())[1]', 'nvarchar(50)') AS SOMETHING_B,
Customer.value('(./SOMETHING_C/node())[1]', 'nvarchar(100)') AS SOMETHING_C
FROM @XML.nodes('/CUSTOMER_IMPORT/CUSTOMER') AS A(Customer);