I have a query which works well except that it doesn't return unique values, but repeats some.
If the columns of the query were independent and not part of a string concatenation I am able to add distinct as long as I also add each item in the select to the order by clause, which is what you would expect.
The complication comes when I use a concatenated string (as presented here) because, by design, if I add a distinct to the select, SQL Server will complain that one needs to include each order by item in the select and vice versa, but how does one do that when a string concat via for xml is being done? That is where I am stuck.
I tried various incantations and haven't landed on anything that works with the string concatenation scheme I have presented here.
Unfortunately I must return the data concatenated as presented here as that data must appear all in one column which is later consumed by some SQLCLR code.
So again, the select needs a distinct keyword, but in this current form, SQL Server will need some adjustments before it will accept it.
(Also note that the code is part of a larger query so the alias and the ending comma can be removed and a select added to the beginning and @Warehouses and s.itemcode can be declared as variables in a test environment which is how I've tested various [failed] approaches.)
My sense is that some SQL Server XML skills might be needed or turning this into a subquery with an outer query that somehow picks up the now unique subquery columns and concatenates them.
Any ideas?
coalesce(stuff((select '; ' + convert(nvarchar, t0.docnum) + ': '
+ convert(nvarchar, t0.duedate, 101) + ', '
+ convert(nvarchar, convert(int, t0.plannedqty))
from owor t0
inner join wor1 t1 on t0.docentry = t1.docentry
where t0.status in ('P','R')
and t1.warehouse in (select value from string_split(@Warehouses,','))
and t0.itemcode = s.itemcode
order by t0.duedate
for xml path(''), type).value('text()[1]', 'nvarchar(max)'), 1, 2, N''), N'') as p_pros_data,
distinctis in any case often less efficient when used on calculated columns, as it cannot use indexes.So just use a
group byinsteadIf you have SQL Server 2017+ available, you can use the much more efficient
STRING_AGGfunction. You still need to group separately, because there is noDISTINCTmodifier, it's also more efficient as mentioned.