Types of tables in SQL DW and when to choose which type


I have a question related to the types of tables in SQL DW and how to create them.

My understanding is that there are 3 types:
1. Heap (row store)
2. Clustered index (row store, indexed on one column)
3. Clustered columnstore index (column store, indexed on one column)

A heap is ideal for staging tables. Clustered indexes are for dimension tables that are huge in volume (more than 10K rows, up to 20M+). Clustered columnstore indexes are for huge transaction tables (more than 60M records).

Now, I have a transaction table with 25M records and 255 columns. The table has columns like StartDateKey, EndDateKey, LoadDateKey, CustId, ProdId, SourceId, and so on. The transaction table is joined on 40+ columns in the SSAS data model.

In this case, which table type should I use?
1. Clustered index, because the table has fewer than 60M records
2. Clustered columnstore index, because a lot of joins are happening

Is my basic approach of deriving the table type from the number of records correct? Or should I also consider the number of columns, the number of joins, and the usage pattern?

I expect my transaction table to be queried multiple times a day and loaded 3 times a day.

Thanks, Aravind

Best answer

Another indicator for a clustered columnstore is a table with a large number of columns. In a columnstore, each column segment is stored on disk and cached in memory separately, so you can scan just the columns you need, and your cache efficiency increases. My guess is that a clustered columnstore will be best here, even if each distribution is undersized.
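To put a number on "undersized", here is a rough back-of-the-envelope sketch. It assumes two platform figures that are not stated in this thread: that SQL DW spreads every table across 60 distributions, and that a columnstore rowgroup holds at most 1,048,576 rows. It also assumes the rows are spread evenly. The function names are made up for illustration.

```python
# Assumptions (not from the original post): 60 distributions per database
# and a maximum of 1,048,576 rows per columnstore rowgroup.
DISTRIBUTIONS = 60
MAX_ROWGROUP_ROWS = 1_048_576


def rows_per_distribution(total_rows: int, distributions: int = DISTRIBUTIONS) -> float:
    """Average rows per distribution, assuming a perfectly even spread."""
    return total_rows / distributions


def rowgroup_fill(total_rows: int) -> float:
    """Fraction of one full rowgroup each distribution could fill (capped at 1.0)."""
    return min(1.0, rows_per_distribution(total_rows) / MAX_ROWGROUP_ROWS)


if __name__ == "__main__":
    cases = [("table in the question", 25_000_000), ("60M rule of thumb", 60_000_000)]
    for label, total in cases:
        print(
            f"{label}: {rows_per_distribution(total):,.0f} rows per distribution, "
            f"{rowgroup_fill(total):.0%} of a full rowgroup"
        )
```

Under those assumptions, 25M rows works out to roughly 417,000 rows per distribution, about 40% of a full rowgroup, while 60M rows gives about 1M rows per distribution. That is likely where the 60M rule of thumb comes from, and why a 25M-row table counts as undersized without being ruled out.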

But what's best is always workload-dependent, so while there are rules of thumb, you should test the various options. There may be tradeoffs between loading, updating, and querying. In Azure, with a database of that size, testing will be quick and inexpensive.
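One way to set up such a test is to copy the table once per storage option with CREATE TABLE AS SELECT (CTAS) and run the same load and query workload against each copy. The sketch below only generates the statements. The table name dbo.FactTransaction, the choice of CustId as the hash distribution column, and the clustered index key (StartDateKey, CustId) are all hypothetical picks for illustration, so substitute whatever fits your schema.

```python
# Hypothetical names throughout: dbo.FactTransaction is a stand-in for the
# real table; CustId and StartDateKey come from the question, but using them
# as distribution or index keys is an assumption, not a recommendation.
STORAGE_OPTIONS = {
    "heap": "HEAP",
    "ci": "CLUSTERED INDEX (StartDateKey, CustId)",
    "cci": "CLUSTERED COLUMNSTORE INDEX",
}


def ctas_statement(source: str, suffix: str, storage: str, hash_column: str = "CustId") -> str:
    """Build a CTAS statement that copies `source` into `<source>_<suffix>`."""
    return (
        f"CREATE TABLE {source}_{suffix}\n"
        f"WITH (DISTRIBUTION = HASH({hash_column}), {storage})\n"
        f"AS SELECT * FROM {source};"
    )


def candidate_statements(source: str) -> list[str]:
    """Return one CTAS statement per storage option to compare."""
    return [ctas_statement(source, suffix, storage) for suffix, storage in STORAGE_OPTIONS.items()]


if __name__ == "__main__":
    for statement in candidate_statements("dbo.FactTransaction"):
        print(statement, end="\n\n")
```

Timing the three daily loads and a representative set of the SSAS queries against each copy should show the tradeoffs more reliably than any row-count threshold.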

Note that a clustered index can have multiple index columns, and a clustered columnstore is not "indexed on one column". Each column is stored separately, and no one column is "special".