(SQL Server) Why GUID or other random value as high-order key will cause fragmentation?

444 Views Asked by At

I'm watching Paul S Randal's index fragmentation video at here. In the video, Paul said that GUID or other random values as high-order key will cause page splits which in turn will cause index fragmentation. The reason is that, random values means the new records will be inserted into random pages, which will probably cause page split on random page. My question: if the new records are not inserted into random pages, page split still can occur, right? If this is true, what's the difference in between the two cases?

BTW, does high-order key mean primary key? I'm not a native English speaker and not sure of it.

Thanks.

2

There are 2 best solutions below

0
On BEST ANSWER

The index data is clustered on the keys used. The "high-order" here is mentionned because indexes can be composite, and the first member is the most significant for the order.

So if this member is random (and a GUID has to be seen as random, even if crypto guys don't like this term for GUIDs) you'll get much more frequent splits than if you insert consecutive values, even if the values are not at the end of the range.

(That's also why using NEWSEQUENTIALID() instead of NEWID() helps a lot for the insert performance when using a GUID for a clustered primary key).

2
On

If your keys are sequential, then the data will all be appended at the end, meaning new pages will be created as necessary. This will however create a "hotspot" where data is being pushed in a high volume system. The solution to this is to have a clustered index across some kind of categorisation field, then the incrementing primary key.