SQL Server FOR XML query not returning results in expected hierarchy

170 Views Asked by At

I have a system of four related tables, A, B, C, and D, and I need to get an XML representation of that data. The tables look like this:

A ← B´
↑
B               Both table B and B´ have table A as a parent.
↑               Table C has table B as a parent.
C

Except for table A, which has no parent, each of the tables has 3 fields:

ID     int  -- integer ID of field
XStuff varchar(20) -- Replace 'X' with table name letter, such as AStuff, BStuff, etc.
Y_ID   int  -- Foreign key to parent table. Replace 'Y' with parent table name letter.

Table data:

tblA                 tblB
===============      =============
    ID | 1               ID | 1
AStuff | 'This'      BStuff | 'is'
                       A_ID | 1


tblC                 tblBPrime
============         ====================
    ID | 1                    ID | 1
CStuff | 'a'         BPrimeStuff | 'test'
  B_ID | 1                  A_ID | 1

Query:

SELECT tblA.AStuff
      ,tblB.BStuff
      ,tblC.CStuff
      ,tblBPrime.BPrimeStuff
  FROM tblA 
    JOIN tblB ON tblB.A_ID = tblA.ID
    JOIN tblC ON tblC.B_ID = tblB.ID
    JOIN tblBPrime ON tblBPrime.A_ID = tblA.ID
  FOR XML AUTO, TYPE, ELEMENTS

Which results in the following output:

<tblA>
  <AStuff>This</AStuff>
  <tblB>
    <BStuff>is</BStuff>
    <tblC>
      <CStuff>a</CStuff>
      <tblBPrime>
        <BPrimeStuff>test</BPrimeStuff>
      </tblBPrime>
    </tblC>
  </tblB>
</tblA>

This is confusing because tblBPrime is subordinate to tblA, not tblC, and I'd expect the output to be something like this:

<tblA>
  <AStuff>This</AStuff>
  <tblB>
    <BStuff>is</BStuff>
    <tblC>
      <CStuff>a</CStuff>
    </tblC>
  </tblB>
  <tblBPrime>
    <BPrimeStuff>test</BPrimeStuff>
  </tblBPrime>
</tblA>

I'd like to know how the query should be changed to get the results I expect. By the way, I've tagged this SQL Server 2008 because the oldest SQL Server box that will have to run this code is at that level, but it also needs to work on SQL Server version 2012 and 2014.

EDIT:

As requested, here are scripts to create the tables and their relationships and populate them:

Create tables and relationships:

CREATE TABLE [dbo].[tblA](
    [ID] [int] NOT NULL,
    [AStuff] [varchar](50) NOT NULL,
 CONSTRAINT [PK_tblA] PRIMARY KEY CLUSTERED(
    [ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[tblB](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [BStuff] [varchar](50) NOT NULL,
    [A_ID] [int] NOT NULL,
 CONSTRAINT [PK_tblB] PRIMARY KEY CLUSTERED(
    [ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[tblB]  WITH CHECK ADD  CONSTRAINT [FK_tblB_tblA] FOREIGN KEY([A_ID])
REFERENCES [dbo].[tblA] ([ID])
ON DELETE CASCADE
GO
ALTER TABLE [dbo].[tblB] CHECK CONSTRAINT [FK_tblB_tblA]
GO
CREATE TABLE [dbo].[tblBPrime](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [BPrimeStuff] [varchar](50) NOT NULL,
    [A_ID] [int] NOT NULL,
 CONSTRAINT [PK_tblBPrime] PRIMARY KEY CLUSTERED(
    [ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[tblBPrime]  WITH CHECK ADD  CONSTRAINT [FK_tblBPrime_tblA] FOREIGN KEY([A_ID])
REFERENCES [dbo].[tblA] ([ID])
GO
ALTER TABLE [dbo].[tblBPrime] CHECK CONSTRAINT [FK_tblBPrime_tblA]
GO
CREATE TABLE [dbo].[tblC](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [CStuff] [varchar](50) NOT NULL,
    [B_ID] [int] NOT NULL,
 CONSTRAINT [PK_tblC] PRIMARY KEY CLUSTERED(
    [ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[tblC]  WITH CHECK ADD  CONSTRAINT [FK_tblC_tblB] FOREIGN KEY([B_ID])
REFERENCES [dbo].[tblB] ([ID])
ON DELETE CASCADE
GO
ALTER TABLE [dbo].[tblC] CHECK CONSTRAINT [FK_tblC_tblB]
GO

Populate tables:

SET XACT_ABORT ON;
DECLARE @A_ID int = 1,@B_ID int,@C_ID int
BEGIN TRAN
INSERT INTO [dbo].[tblA]([ID],[AStuff])
  VALUES (@A_ID,'This');
INSERT INTO [dbo].[tblB]([BStuff],[A_ID])
   VALUES('is',@A_ID);
SET @B_ID = SCOPE_IDENTITY();
INSERT INTO [dbo].[tblC]([CStuff],[B_ID])
   VALUES('a',@B_ID);
SET @C_ID = SCOPE_IDENTITY();
INSERT INTO [dbo].[tblBPrime]([BPrimeStuff],[A_ID])
   VALUES('test',@A_ID);
COMMIT TRAN
1

There are 1 best solutions below

0
On

One solution seems to be to use FOR XML PATH instead of FOR XML AUTO. I found some great information on using FOR XML statements in a number of articles written by Jacob Sebastian on SQLServerCentral.com: http://www.sqlservercentral.com/articles/XML/3022/ .

The following query produces the expected results, but depends on having the hierarchy of each element defined directly in the query. I'm hoping that there's a better way:

SELECT tblA.AStuff
      ,tblB.BStuff as [tblB/BStuff]
      ,tblC.CStuff as [tblB/tblC/Cstuff]
      ,tblBPrime.BPrimeStuff as [tblBPrime/BPrimeStuff]
  FROM tblA 
    JOIN tblB ON tblB.A_ID = tblA.ID
    JOIN tblC ON tblC.B_ID = tblB.ID
    JOIN tblBPrime ON tblBPrime.A_ID = tblA.ID
  FOR XML PATH ('tblA'), ELEMENTS;

...outputs...

<tblA>
  <AStuff>This</AStuff>
  <tblB>
    <BStuff>is</BStuff>
    <tblC>
      <Cstuff>a</Cstuff>
    </tblC>
  </tblB>
  <tblBPrime>
    <BPrimeStuff>test</BPrimeStuff>
  </tblBPrime>
</tblA>