I've hit an interesting problem in our system. A design decision was made to include IDs in a comma separated string which are referenced in the day-to-day functioning of the system. It wasn't causing any problems initially but, now we have a large amount of data, the query to extract information is taking 30-60 seconds. This is mainly down to it checking every row's string for a particular ID and using a function to split each string. I expect that extrapolating this information into an indexed view would speed things up significantly but, although I'm able to create the query to do the work, I can't put it into an indexed view because it uses a cross apply.
The data is currently in this form
| ProductId | OtherProductId |
|---|---|
| 123 | 123, 234, 456 |
| 456 | 987, 987, 876, 512 |
and I need to get it into an indexed view like this
| ProductId | OtherProductId |
|---|---|
| 123 | 123 |
| 123 | 234 |
| 123 | 456 |
| 456 | 987 |
| 456 | 876 |
| 456 | 512 |
I can then join by OtherProductId and get the ProductIds associated.
The following will provide the data in a form I can use but takes 30+ seconds to run
SELECT DISTINCT ProductId,
LTRIM(RTRIM(m.n.value('.[1]', 'varchar(10)'))) AS OtherProductId
FROM (
SELECT ProductId,
CAST('<XMLRoot><RowData>'
+ REPLACE(OtherProductIds, ',', '</RowData><RowData>')
+ '</RowData></XMLRoot>' AS XML) AS x
FROM MySourceTable
) t
CROSS APPLY x.nodes('/XMLRoot/RowData') m(n)
I'd like to convert this into an indexed view so I can join it to my actual query and speed things up.
We're using SQL Server 2008 compatibility mode so are unable to make use of split_string and all the other newer features. Changing the code would require too much upheaval and have far-reaching issues in implementing but I am able to change and update the database as necessary.
Thanks to Thom A for pointing me to this in his comment. The solution I've used is
The query has dropped to around 5 seconds using this. Thanks to everyone for their help!