Does adding a primary key cause restructuring of underlying data

Question

692 Views Asked by spender At 29 July 2025 at 05:58

I'm importing a fairly hefty amount of data into a SQL Server database. The source data originates from PostgreSQL (including table definitions), which I run through some fairly simple regexes to translate to T-SQL. This creates tables with no primary key.

As far as I understand, the lack of a primary key/clustered index means that the data is stored in a heap.

Once the import is complete, I add PKs as follows:

ALTER TABLE someTable ADD CONSTRAINT PK_someTable PRIMARY KEY (id);

(Note the lack of the CLUSTERED keyword.) What's going on now? Is it still a heap? What's the effect on lookup by primary key? Is this really any different from adding a standard index?

Now, say instead I add PKs as follows:

ALTER TABLE someTable ADD CONSTRAINT PK_someTable PRIMARY KEY CLUSTERED (id);

I assume this now completely restructures the table into a row based structure with more efficient lookup by PK but less desirable insertion characteristics.

Are my assumptions correct?

If my import inserts data in PK order, is there any benefit to omitting the PK in the first place?

There are 3 best solutions below

Gary Walker On 30 August 2013 at 18:53

In SQL Server, a primary key defaults to clustered if no clustered index exists. A clustered index is not kept in a separate storage area (as a non-clustered index is); instead, the leaf level of the index is the table data itself, stored in key order. If you think about this, you will realize that there can be only one clustered index per table.

The real advantage of a clustered index is that the data is near the index data, so you can grab both while the drive head is "in the area". A clustered index is noticeably faster than a non-clustered index when the data you are processing exhibits locality of reference, that is, when rows with nearby key values tend to be read at the same time.

For example, if your primary key is SSN, you do not get a large advantage if you are processing data that is randomly ordered with respect to SSN, though you still get some advantage due to the nearness of the data. But if you can presort the input by SSN, a clustered key is a large advantage.

So yes, creating a clustered index does reorder the data, so that the rows are stored in the order of the clustered key.
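
The default described above can be overridden. The following is a minimal sketch, assuming a throwaway table named dbo.HeapDemo (a hypothetical name, not from the question), of adding the primary key with an explicit NONCLUSTERED keyword. In that case the rows should stay in the heap and the key gets a separate index:

```sql
-- dbo.HeapDemo and its columns are hypothetical, used only for illustration.
BEGIN TRAN;
CREATE TABLE dbo.HeapDemo
(
    id INT NOT NULL,
    payload VARCHAR(50) NOT NULL
);

-- Explicit NONCLUSTERED: the PK is enforced by a separate b-tree,
-- and the table rows themselves remain in the heap.
ALTER TABLE dbo.HeapDemo
ADD CONSTRAINT PK_HeapDemo PRIMARY KEY NONCLUSTERED (id);

SELECT  i.name, i.index_id, i.type_desc
FROM    sys.indexes i
WHERE   i.object_id = OBJECT_ID(N'dbo.HeapDemo')
ORDER BY i.index_id;
/*
Expected for a table with no other indexes:
name         index_id    type_desc
------------ ----------- ------------
NULL         0           HEAP
PK_HeapDemo  2           NONCLUSTERED
*/
ROLLBACK;
```

Replacing NONCLUSTERED with CLUSTERED (or omitting the keyword on a table that has no clustered index) is what causes the rows to be rewritten in key order.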

Niels On 28 June 2017 at 07:41

Thanks for a nice demonstration of the subject!

The conclusions above are not wrong, but they show the structure of the index, not of the table. I think the following SQL will show information for the actual table:

select 
    o.name, 
    o.object_id, 
    case 
      when p.index_id = 0 then 'Heap'
      when p.index_id = 1 then 'Clustered Index/b-tree'
      when p.index_id > 1 then 'Non-clustered Index/b-tree'
    end as 'Type'
from sys.objects o
inner join sys.partitions p on p.object_id = o.object_id
where o.name = 'MyTable';

You will see that MyTable is clustered:

name    object_id   Type
------- ----------- ----------------------
MyTable 1237579447  Clustered Index/b-tree
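
If the table also has non-clustered indexes (or several partitions), the query above returns one row for each of them. A possible variant, sketched here against sys.indexes and restricted to index_id 0 and 1, returns only the row that describes how the table data itself is stored. It reuses the table name MyTable from the query above:

```sql
-- Sketch: report only the base storage structure of the table.
-- A table has either index_id 0 (heap) or index_id 1 (clustered index), never both.
SELECT  o.name,
        i.index_id,
        i.type_desc              -- HEAP or CLUSTERED for rowstore tables
FROM    sys.objects o
INNER JOIN sys.indexes i ON i.object_id = o.object_id
WHERE   o.name = N'MyTable'
  AND   i.index_id IN (0, 1);    -- 0 = heap, 1 = clustered index
```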

**Bogdan Sahlean** · Accepted Answer

When you execute

ALTER TABLE someTable ADD CONSTRAINT PK_someTable PRIMARY KEY (id);

then, if there is no clustered index on someTable, the PK will be a clustered PK. Otherwise, if a clustered index already exists before ALTER ... ADD ... PRIMARY KEY (id) is executed, the PK will be a non-clustered PK.

-- Test #1

BEGIN TRAN;
CREATE TABLE dbo.MyTable
(
    id INT NOT NULL,
    Col1 INT NOT NULL,
    Col2 VARCHAR(50) NOT NULL
);
SELECT  i.name, i.index_id, i.type_desc
FROM    sys.indexes i
WHERE   i.object_id = OBJECT_ID(N'dbo.MyTable');
/*
name index_id    type_desc
---- ----------- ---------
NULL 0           HEAP
*/
ALTER TABLE dbo.MyTable
ADD CONSTRAINT PK_MyTable PRIMARY KEY (id);

SELECT  i.name, i.index_id, i.type_desc
FROM    sys.indexes i
WHERE   i.object_id = OBJECT_ID(N'dbo.MyTable');
/*
name        index_id    type_desc
----------- ----------- ---------
PK_MyTable  1           CLUSTERED
*/
ROLLBACK;

-- Test #2

BEGIN TRAN;
CREATE TABLE dbo.MyTable
(
    id INT NOT NULL,
    Col1 INT NOT NULL,
    Col2 VARCHAR(50) NOT NULL
);
SELECT  i.name, i.index_id, i.type_desc
FROM    sys.indexes i
WHERE   i.object_id = OBJECT_ID(N'dbo.MyTable');
/*
name index_id    type_desc
---- ----------- ---------
NULL 0           HEAP
*/
CREATE CLUSTERED INDEX ix1
ON dbo.MyTable(Col1);

SELECT  i.name, i.index_id, i.type_desc
FROM    sys.indexes i
WHERE   i.object_id = OBJECT_ID(N'dbo.MyTable');
/*
name index_id    type_desc
---- ----------- ---------
ix1  1           CLUSTERED
*/

ALTER TABLE dbo.MyTable
ADD CONSTRAINT PK_MyTable PRIMARY KEY (id);

SELECT  i.name, i.index_id, i.type_desc
FROM    sys.indexes i
WHERE   i.object_id = OBJECT_ID(N'dbo.MyTable');
/*
name       index_id    type_desc
---------- ----------- ------------
ix1        1           CLUSTERED
PK_MyTable 2           NONCLUSTERED
*/
ROLLBACK;
