Does clustered index have a separate index file?

167 Views Asked by At

Do clustered indexes have a separate index file or is it the table itself? I read from my textbook that clustered indexes change the physical ordering of the actual file. So does it need a separate file? Kindly provide a generic answer if possible, as this is from my texbook which states this topic as an abstract concept and not specific to dbms. Textbook referred:Database management systems by raghu ramakrishnan. RDBMS: Postgresql Thanks to everyone who tries to help :)

2

There are 2 best solutions below

3
On BEST ANSWER

Clustered index in the way you explaining is reordering the data in the database file itself.

In the 90's this was widely used in dbase, but it did not help very much after you build it. Bad example is - collegue of mine reordered a table so id 1 as on first row, id 2 on second, id 42 was on row 42. Finally he was able to access the table as array.

This was supported in Informix.

This is also supported in MySQL MyISAM, but there command is - 'alter table x order by name'. However in MySQL, database does not take advantage of the reordering. Only thing is - if you do 'select * ftom x where city=5', results will be ordered by name (notice there are no order by in the SQL).

Note clustered index exists until you insert some data into database.

So Answer - No it does not have additional space (file). However I don't help as much as books suggest.

2
On

PostgreSQL does not have a clustered index, so the question is moot.

PostgreSQL does have a CLUSTER command, but that rewrites a table (and its indexes) so that the table is physically sorted in index order afterwards. The index by which the table is cluatered will not be different from other indexes afterwards, it just provided the ordering.

This physical table order is not preserved, and subsequent table modifications let the ordering rot, so that you will have to run another CLUSTER after a while. An additional disadvantage is that the table is lockes to concurrent reads while CLUSTER is running.

You cluster a table to speed up index range scans: if adjacent index entries point to the same table block, you will have substantially less I/O, and additionally, that I/O tends to be sequential.