I'm importing a fairly hefty amount of data into a SQL Server database. The source data originates from PgSql (including table defs), which I throw through some fairly simple regex to translate to TSql. This creates tables with no primary key.
As far as I understand, lack of a primary key/clustering index means that the data is stored in a heap.
Once the import is complete, I add PKs as follows:
ALTER TABLE someTable ADD CONSTRAINT PK_someTable PRIMARY KEY (id);
(note the lack of CLUSTERED
keyword). What's going on now? Still a heap? What's the effect on lookup by primary key? Is this really any different to adding a standard index?
Now, say instead I add PKs as follows:
ALTER TABLE someTable ADD CONSTRAINT PK_someTable PRIMARY KEY CLUSTERED (id);
I assume this now completely restructures the table into a row based structure with more efficient lookup by PK but less desirable insertion characteristics.
Are my assumptions correct?
If my import inserts data in PK order, is there any benefit to omitting the PK in the first place?
When you execute
if there is no clustered index on
someTable
then the PK will be a clustered PK. Otherwise, if there is a clustered index before executingALTER .. ADD ... PRIMARY KEY (id)
the PK will be a non-clustered PK.-- Test #1
-- Test #2